Job Description
About the Role
We are building a global Data Science hub in Colombia and are seeking an Analyst II for the Model Automation & Production team. This hands-on role supports end-to-end model operations for global partners. It requires solid modeling experience, autonomy, and a proven ability to deliver reproducible, production-ready solutions using modern MLOps practices.
What you will do
Develop, validate, and operationalize predictive models (GLMs, tree-based, and other ML methods) to solve business problems across insurance areas.
Work hands-on with large structured and unstructured datasets for feature engineering, model building, and diagnostics.
Design and run controlled experiments in Databricks (Databricks Experiments), track runs, analyze results, and promote reproducible workflows.
Implement experiment and model versioning using GitHub for code/version control and experiment-tracking tools (MLflow or equivalent) for reproducibility.
Build model monitoring, drift detection, and automated retraining triggers to maintain production performance.
Produce interpretable model explanations (SHAP, PDP/ICE) and translate results into actionable recommendations for stakeholders.
Collaborate closely with global technical teams and business partners, delivering clear documentation, code reviews, and handoffs for production deployment.
Key Responsibilities:
Apply data science techniques to explore, manipulate, and analyze large structured and unstructured datasets to uncover insights and support business decisions.
Formulate and test hypotheses with statistical rigor, ensuring reliability and relevance of analytical outcomes.
Translate complex quantitative findings into clear visualizations and compelling narratives for diverse audiences, including business stakeholders.
Own analyses and modeling tasks of moderate complexity end-to-end, including EDA, feature engineering, modeling, validation, and deployment support.
Implement reproducible pipelines and adhere to software engineering best practices (version control, code reviews, unit tests, CI/CD where applicable).
Maintain clear, business-focused communication with stakeholders, producing concise dashboards/reports and decision-ready insights.
Stay current with Generative AI/LLM techniques and apply them where they add value.
Engage with the global Data Science community, participating in cross-functional initiatives and knowledge-sharing forums to promote innovation and best practices.
Skills and Experience:
Fluency in English at the C1 level (required)
Bachelor’s degree in Statistics, Mathematics, Computer Science, Economics, Actuarial Science, or related quantitative field. Master’s preferred or equivalent industry experience.
5-8+ years of professional experience in data science, with proven experience building and validating predictive models.
Proficiency in Python and SQL for data manipulation and modeling.
Proficiency with Databricks and Databricks Experiments (running experiments, analyzing runs, reproducible code). Experience with Git for version control and collaborative workflows.
Hands-on experience with AWS services (e.g., S3, EC2, and related) preferred.
Demonstrated ability to work independently, manage multiple priorities, and deliver with minimal supervision.
Strong analytical thinking, statistical rigor, and ability to explain complex quantitative results to non-technical stakeholders.
High autonomy and ownership: able to scope and drive projects end-to-end.
Strong communicator: able to present concise, business-focused arguments and recommendations, with proven skill in translating data science solutions into clear, measurable business actions and communicating their value to stakeholders.
Collaborative and coachable: contributes to team practices and learns quickly from feedback.
Highly desirable / nice to have:
Experience with Generative AI / LLMs (prompt engineering, vector embeddings, RAG, evaluation and mitigation strategies for hallucinations).
Familiarity with MLflow or similar experiment and model tracking systems.
Experience implementing model monitoring, feature/label drift detection, and automated retraining pipelines.
Knowledge of actuarial/insurance modeling concepts and domain experience.