
Senior Data Scientist NLP/GenAI - Catalog


Labs: Senior Data Scientist NLP/GenAI - Catalog

Headquarters: Remote - France
URL: http://mirakl.fr

About Mirakl

Mirakl is a leading provider of e-commerce software solutions. They offer businesses a unique suite of solutions to transform their digital activities and accelerate growth.

Since 2012, Mirakl has been assisting B2C and B2B companies with advanced, secure, and scalable technology. This technology enables them to digitize their activities, expand their offerings through marketplaces or dropshipping, streamline catalog and payment management for suppliers, offer personalized shopping experiences, and increase profits through retail media.

Mirakl partners with over 450 leading companies worldwide, including Airbus, Maisons du Monde, Decathlon, H&M, Sonepar, and Toyota Material Handling. For more information, visit www.mirakl.fr.

About Mirakl Labs

Mirakl Labs, the technical and product teams, are primarily located in Paris and Bordeaux. They collaborate daily to address the challenges of clients and users by tackling issues related to new features, scalability, security, and ergonomics.

They work in an agile mode, organized into Squads composed of a Squad Lead, 5 developers, a Product Manager, and a QA. Each Squad specializes in a functional scope to design and implement new features, their evolutions, and APIs (with a micro-services breakdown). Infrastructure, Architecture, Security, Documentation, Product Design, Data, and Support teams work across all Squads, providing their expertise and keeping the products consistent.

All teams are responsible for their scope, and each employee contributes their experience and ideas. Innovation, feedback, and involvement in decision-making are at the heart of their philosophy.

To foster this sharing with other enthusiasts, they sponsor, speak at, and host various events, meetups, and associations in the French Tech scene. In recent years, they have participated in events such as Devoxx, ReactEurope, ProductConf, and Flupa UX Days.

About the Job

As a Data Scientist, your primary mission will be to prototype, iterate, and deploy algorithms in collaboration with Product teams, Data Engineers, and development teams.

Your projects will focus on Marketplace catalog challenges, including NLP, Computer Vision, and the use of Generative AI (custom LLMs) at large scale. The topics you will address will have a significant impact on clients: the ambition is to make the best use of rich and varied data to increase their turnover, optimize the management of their marketplace, and guarantee the security of users and transactions.

The position is a permanent contract (CDI), based in Paris, Bordeaux, or fully remote.

Catalog Topics:

  • Automatic rewriting of marketing content based on business expectations
  • Extraction of product attributes from images and free text
  • Detection of product variants
  • Product categorization
  • Automatic onboarding of vendor products
  • Merging product sheets from multiple sources
  • Prediction of trending products

What's in it for you:

  • Implement algorithms that will have a visible impact on over 500 e-commerce/marketplace sites in 40 countries, some with very large volumes (millions of products, customers, and orders per year)
  • Work with varied cutting-edge techniques (multimodal models, fine-tuning of LLMs, etc.). Mirakl is one of the few French players running fine-tuned LLMs in production at large scale; join them to keep cultivating this pioneering spirit
  • Real autonomy and responsibility in the projects you own

Stack and Tools

Python, TensorFlow, PyTorch, Hugging Face, Databricks, Spark, AWS (Amazon Redshift, S3, etc.), SQL, Airflow, Delta Lake. LLM-specific: AutoTrain, Unsloth, Galileo, LangChain, Anyscale.

Daily Responsibilities:

  • Analyze, prepare data, and prototype algorithms
  • Deploy them in collaboration with Data Engineers and development teams
  • Create dashboards to illustrate the relevance of the algorithms and monitor production
  • Present results at the weekly data science meeting and participate in team brainstorming sessions
  • Exchange with other teams to refine use cases, user experience, and integration methods

You will enjoy this job if:

  • You have a minimum of 4 years of experience as a Data Scientist, with significant experience in NLP and ML applied in a company
  • You have already put Machine Learning algorithms into production
  • You have a good knowledge of NLP and Computer Vision algorithms and state-of-the-art architectures such as Transformers (knowledge of the latest LLMs is a plus)
  • You master Python and TensorFlow and/or PyTorch
  • You have experience in Spark development
  • You are pragmatic, data-driven, and business-oriented
  • You like to own your topics, work autonomously, and have strong team spirit
  • You have a positive attitude: respect and kindness are part of your values
  • You enjoy sharing your work in internal presentations, conferences, or by writing articles

Hiring Process:

  • A 30-minute phone call with one of their Tech recruiters. This is an opportunity to discuss your background and expectations, and to discover what Mirakl can offer you in return.
  • A first 30-minute technical exchange over Zoom with a member of the Data Science team, to go into more concrete aspects of your expertise and see how your skills could fit into their projects.
  • A take-home practical case.
  • A 75-minute presentation and technical exchange with a Data Science team manager.
  • A final 1-hour Zoom exchange with future Mirakl colleagues about their values and company culture.

Mirakl is committed to diversity, equal opportunities, and inclusion. They celebrate their differences because they are convinced that the visible and invisible qualities of each Mirakl Worker are a source of strength and innovation. As part of this commitment, they consider all applications without distinction of gender, ethnicity, religion, sexual orientation, disability, age, or any other characteristic protected by law.

To apply: https://weworkremotely.com/remote-jobs/labs-senior-data-scientist-nlp-genai-catalog
