Software Engineer
LLM.co is seeking an experienced AI Software Engineer to help design, fine-tune, and deploy large language models (LLMs) for secure, enterprise-grade applications. This role blends cutting-edge AI development with practical implementation, focusing on private, compliant solutions for industries like law, finance, and healthcare.
The ideal candidate has deep expertise in large language models, machine learning, and generative AI. In this role, you'll contribute to the design, development, and deployment of LLM systems tailored for real-world enterprise applications. You'll work closely with our research, engineering, and product teams to turn advanced AI models into practical, usable tools.
Key Responsibilities
- Design and develop software pipelines for fine-tuning, training, and deploying LLMs (e.g., LLaMA, Mistral, GPT-J, Claude)
- Implement secure, private, and efficient inference systems using open-source or proprietary model architectures
- Build APIs and infrastructure for model serving, prompt engineering, vector search, and retrieval-augmented generation (RAG)
- Optimize model performance for latency, token limits, and memory efficiency across cloud and on-prem environments
- Collaborate with domain experts (legal, finance, healthcare) to embed domain knowledge into models using transfer learning, embeddings, and supervised tuning
- Integrate models into secure enterprise workflows with access control, audit logging, and compliance controls
- Stay current with cutting-edge LLM and ML research, helping guide technical direction and model selection
- Contribute to internal tooling and developer workflows for training, evaluation, and deployment
Requirements
- Bachelor's or Master's degree in Computer Science, AI/ML, or a related field
- 3+ years of experience in AI/ML software engineering, preferably with LLMs or NLP models
- Hands-on experience with PyTorch, TensorFlow, HuggingFace Transformers, LangChain, or similar frameworks
- Strong knowledge of model tuning (LoRA, QLoRA, PEFT), quantization, and prompt engineering
- Familiarity with vector databases (e.g., Pinecone, Weaviate, FAISS, Qdrant) and RAG systems
- Proficiency in Python (required); experience with Rust, Go, or TypeScript is a plus
- Experience with Docker, Kubernetes, and cloud platforms (AWS, GCP, Azure)
- Security- and privacy-minded; experience with on-prem or air-gapped deployments is a major bonus
- Passion for solving hard problems in generative AI with pragmatic engineering
Preferred Qualifications
- Experience in legaltech, fintech, or healthtech environments
- Contributions to open-source LLM or ML projects
- Experience working with knowledge graphs, structured data, and embeddings
- Familiarity with agentic AI, multi-agent systems, or orchestration frameworks like LangGraph
What We Offer
- Competitive salary + equity options
- Flexible remote-first work culture
- Opportunity to work on next-gen AI systems in high-trust industries
- Access to cutting-edge hardware and GPU clusters
- Chance to define the trajectory of secure, private AI adoption