Candidates applying for AI Engineer roles take an average of 7 days to get hired, based on 1 user-submitted interview for this role. By comparison, the hiring process at Axya overall takes an average of 7 days.
I applied online. The process took 1 week. I interviewed at Axya (Montreal, QC) in May 2025
Interview
Interview process concerns: Second round required presentation + diagrams + prototype - excessive time commitment for candidates. Homework assignments appear related to actual company problems, which feels like unpaid consulting work. There are more ethical ways to evaluate technical skills without exploiting candidate time.
Interview questions [2]
Question 1
Axya has built an industrial procurement platform with many diverse types of customers. Each of
these customers also has a diverse pool of suppliers, who all have their own ways of sending
quotations or other kinds of procurement information in PDFs.
Each supplier’s documents follow a stable format, but the format varies across suppliers.
Challenge:
1. Automatically extract structured quote fields (part numbers, unit prices, quantities,
delivery dates, payment terms) from heterogeneous PDF documents.
2. Provide a queryable service endpoint that returns normalized quotes in JSON.
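As an illustration of what "normalized quotes in JSON" could look like, here is a minimal Python sketch. The class names, field names, and the `quote_to_json` helper are assumptions made for this example; the assignment only lists the fields to extract, not a schema.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import date
from decimal import Decimal
from typing import List, Optional


@dataclass
class QuoteLine:
    # Hypothetical line-item shape built from the fields named in the challenge.
    part_number: str
    unit_price: Decimal
    quantity: int
    delivery_date: Optional[date] = None


@dataclass
class NormalizedQuote:
    # supplier_id and currency are assumed extras, not listed in the assignment.
    supplier_id: str
    currency: str
    payment_terms: Optional[str]
    lines: List[QuoteLine] = field(default_factory=list)


def _encode(value):
    # json.dumps calls this for types it cannot serialize natively.
    if isinstance(value, Decimal):
        return str(value)  # keep the exact price, avoid float rounding
    if isinstance(value, date):
        return value.isoformat()
    raise TypeError(f"Cannot serialize {type(value).__name__}")


def quote_to_json(quote: NormalizedQuote) -> str:
    """Serialize a quote to a JSON string with stable key order."""
    return json.dumps(asdict(quote), default=_encode, sort_keys=True)
```

Prices are serialized as strings here so that no precision is lost; a service that prefers numeric JSON values would need a different convention.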
Key Requirements:
● OCR & Layout Analysis: Propose OCR engines and layout models (e.g., Amazon Textract,
Tesseract, LayoutLM) and strategies to detect table/grid structures.
● LLM Integration: Outline how you would use a pre-trained LLM (or fine-tune) to correct,
normalize, and validate extracted text and map to schema.
● Scalability & Fault Tolerance: Design for high throughput and intermittent failures using
AWS primitives.
● MLOps Pipeline: Define CI/CD for pipeline updates, model versioning, automated
testing, and performance monitoring (e.g., SageMaker Pipelines, CloudWatch).
● Deliverable Service: A RESTful API or microservice specification that ingests a PDF
URL (or S3 URI) and returns a JSON payload of extracted fields.
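One way to approach the "correct, normalize, and validate" requirement is to run deterministic checks on whatever the OCR or LLM step returns before it reaches the API response. The sketch below is an assumption-laden example: the input dictionary keys, the `DATE_FORMATS` list, and the separator heuristics are all hypothetical and would need tuning per supplier.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation
from typing import List, Optional, Tuple
import re

# Date layouts to try, in order. Hypothetical list for illustration only.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y")


def parse_price(raw: str) -> Optional[Decimal]:
    """Return the price as a Decimal, or None if it cannot be parsed."""
    cleaned = re.sub(r"[^\d.,-]", "", raw)  # drop currency symbols and spaces
    if "," in cleaned and "." in cleaned:
        cleaned = cleaned.replace(",", "")  # assume comma is a thousands separator
    elif "," in cleaned:
        cleaned = cleaned.replace(",", ".")  # assume comma is a decimal separator
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None


def parse_date(raw: str) -> Optional[str]:
    """Return an ISO 8601 date string, or None if no known layout matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def validate_line(raw: dict) -> Tuple[dict, List[str]]:
    """Normalize one extracted line and list the fields that failed checks."""
    errors = []
    part_number = str(raw.get("part_number", "")).strip().upper()
    if not part_number:
        errors.append("part_number")
    price = parse_price(str(raw.get("unit_price", "")))
    if price is None or price < 0:
        errors.append("unit_price")
    qty_text = str(raw.get("quantity", "")).strip()
    quantity = int(qty_text) if qty_text.isdecimal() else None
    if not quantity:
        errors.append("quantity")
    line = {
        "part_number": part_number,
        "unit_price": price,
        "quantity": quantity,
        "delivery_date": parse_date(str(raw.get("delivery_date", ""))),
    }
    return line, errors
```

Lines that come back with a non-empty error list could be routed to a retry or a human review queue rather than returned to the caller. Note that a value such as "1,250" is ambiguous under these heuristics and is read as 1.250.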
The platform has thousands of aerospace suppliers with structured attributes (capacities,
certifications) and unstructured documents attached to them (HTML pages, PDFs). All of this
information has some commonalities, but a lot of what makes each of these companies
successful doesn’t necessarily fit a common schema.
A buyer for an aerospace company should be able to communicate a need in plain language
and receive a list of suppliers that match its requirements and the context surrounding the
request.
Note: The current system uses full-text ElasticSearch, and you can test it out here:
https://axya.co/suppliers_directory?page=0
Challenge:
1. Index structured and unstructured data into a unified semantic search solution to answer
capability queries (e.g., "CNC machining for titanium aerospace parts").
2. Make sure that the deterministic parts of the query are treated as such (e.g., a specific
required certification or the geolocation of the suppliers).
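The second point can be illustrated by splitting a plain-language request into hard filters and a free-text remainder before any semantic search runs. This Python sketch is only one possible approach: the certification patterns, the location heuristic, and the `ParsedQuery` type are hypothetical, and a production system might use an LLM or a proper entity extractor instead of regular expressions.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical patterns; a real list would come from the platform's own data.
CERT_PATTERNS = {
    "AS9100": r"\bAS\s?9100[A-D]?\b",
    "ISO 9001": r"\bISO\s?9001\b",
    "NADCAP": r"\bNADCAP\b",
}
# Naive heuristic: "in" or "near" followed by capitalized words.
LOCATION_PATTERN = r"\b(?:in|near)\s+([A-Z][\w-]+(?:\s[A-Z][\w-]+)*)"


@dataclass
class ParsedQuery:
    certifications: List[str]  # exact-match filters
    location: Optional[str]    # exact-match or geo filter
    semantic_text: str         # what is left for embedding search


def split_query(query: str) -> ParsedQuery:
    remainder = query
    certs = []
    for name, pattern in CERT_PATTERNS.items():
        if re.search(pattern, remainder, flags=re.IGNORECASE):
            certs.append(name)
            remainder = re.sub(pattern, " ", remainder, flags=re.IGNORECASE)
    location = None
    match = re.search(LOCATION_PATTERN, remainder)
    if match:
        location = match.group(1)
        remainder = remainder.replace(match.group(0), " ")
    # Collapse the gaps left by removed spans and trim stray separators.
    semantic_text = re.sub(r"\s+", " ", remainder).strip(" ,")
    return ParsedQuery(certs, location, semantic_text)
```

For a request such as "CNC machining for titanium aerospace parts, AS9100 certified, near Montreal", the certification and the city become filters, and only the remaining capability description is sent to the semantic layer. The location heuristic will misfire on capitalized material names (for example "in Inconel"), which is one reason to treat it as a placeholder.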
Key Requirements:
● Data Ingestion & Preprocessing: Describe ETL for structured tables and document
parsing (PDF, HTML), metadata extraction, and cleaning.
● Embedding & Vector Store: Choose embedding models (e.g., OpenAI embeddings,
Sentence Transformers) and vector database architecture.
● “RAG” Pipeline: Illustrate how a retrieval layer and LLM can be combined to answer
free‐text queries with structured output (e.g., top-N supplier list with relevancy scores).
● Cloud Deployment: Architect an AWS-based solution for indexing, query API, and
autoscaling.
● MLOps & Monitoring: Propose a CI/CD process for retraining embeddings (if needed),
refreshing indexes, and tracking query performance and drift.
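The retrieval layer described in these requirements can be sketched as "filter first, then rank by similarity". The example below uses tiny hand-written vectors in place of real embeddings and a plain list in place of a vector database; the supplier records, field names, and the `search` function are all assumptions for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(
    query_vec: Sequence[float],
    suppliers: List[Dict],
    required_certs: Sequence[str],
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Return up to top_n (supplier name, score) pairs, best match first."""
    # Deterministic step: drop suppliers missing any required certification.
    eligible = [
        s for s in suppliers
        if set(required_certs) <= set(s["certifications"])
    ]
    # Semantic step: rank the remaining suppliers by embedding similarity.
    scored = [
        (s["name"], round(cosine(query_vec, s["embedding"]), 3))
        for s in eligible
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
```

In a real deployment the filter would typically be pushed down into the search engine or vector store as a metadata filter, and the scored list could then be handed to an LLM to produce the structured answer.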
Note 1: Whenever possible, we much prefer reusing existing technologies to adding new
ones.
Note 2: All of the information collected and used for indexing is public information from
suppliers.
Deliverables
1. Slide Deck: 12–15 slides covering both projects end-to-end.
2. Architecture Diagrams: Detailed AWS diagrams for each system’s components, data
flows, and failover strategies.
3. Code Snippets / Pseudocode: Examples of key modules (e.g., data ingestion, model
inference, CI pipeline definitions).
4. Security & Compliance Notes: Brief discussion on data privacy and access controls
(when necessary).
5. (bonus) Optional Prototype: If time permits, a minimal proof‐of‐concept (e.g., Jupyter
notebook or small Lambda function).