Software Engineer, Distributed Systems

fal · May 14, 2026
🌍 On-site · Turkey · Full-time

💰 $120,000 – $180,000/yr

Job Description

About This Role

fal is seeking an experienced Software Engineer specializing in distributed systems to join our platform engineering team. You will design, build, and scale the core infrastructure that powers large-scale AI workload orchestration and compute distribution. This is a high-impact position for a senior engineer who thrives on solving complex technical challenges in production environments that handle massive traffic and data volumes.

As a Distributed Systems Engineer at fal, you excel at building large-scale computing platforms. You have deep expertise in distributed systems architecture that manages high complexity and handles significant traffic and data volumes, and you understand how to achieve reliability and operational efficiency with minimal overhead. Your background shows a proven ability to design systems that sustain performance under extreme load while remaining maintainable and observable.

Key Responsibilities

  • Build and maintain the core Python/Rust platform infrastructure, including request routing, AI workload orchestration, intelligent scheduling algorithms, GPU autoscaling systems, large-scale distributed file storage, and enterprise-grade queueing systems
  • Produce forward-looking architectural designs for platform evolution as fal scales to handle 100x current traffic while maintaining low latency across all regions
  • Leverage artificial intelligence extensively to automate and optimize the complex aspects of building reliable distributed systems, reducing manual operational burden
  • Profile, analyze, and optimize low-level CPU and memory performance across the entire platform stack
  • Collaborate with cross-functional teams to define technical standards and drive architectural decisions that support long-term scalability
  • Contribute to system observability and monitoring infrastructure to ensure visibility into platform health and performance

Required Qualifications

  • 5+ years of professional experience building and maintaining distributed compute and orchestration platforms using Python, Rust, or similar systems programming languages
  • Strong foundational understanding of distributed systems theory, including consensus algorithms, scheduling theory, fault tolerance mechanisms, and capacity planning strategies
  • Deep knowledge of computational complexity analysis and memory allocation patterns in large-scale systems
  • Demonstrated track record of designing and shipping systems that perform reliably under real production workloads at significant scale
  • Hands-on experience building comprehensive observability solutions (logging, metrics, tracing) and using observability data to drive performance and reliability improvements
  • Excellent written and verbal communication skills, with the ability to influence technical decisions across multiple teams and stakeholder groups
  • Self-starter mentality, with the ability to execute quickly, take full ownership of projects, and continuously seek process and technical improvements

Nice-to-Have Qualifications

  • Production experience with AI/ML inference platforms, training infrastructure, or GPU workload management systems
  • Background in high-performance systems programming, including async runtimes, zero-copy patterns, and memory-safe concurrent programming models
  • Experience designing and operating multi-tenant compute platforms with strong isolation and resource management
  • Deep understanding of networking fundamentals, TCP/IP performance characteristics, and load balancing strategies
  • Familiarity with GPU workload characteristics, scheduling constraints, and GPU cluster management systems

What fal Offers

  • Intellectually challenging and meaningful work on cutting-edge distributed systems infrastructure
  • Extensive learning and professional growth opportunities through exposure to complex technical problems and modern systems architecture
  • Regular team events and company offsites fostering collaboration and company culture
  • Opportunity to work with a talented team building the future of AI compute infrastructure

Location & Work Arrangement

This position is based in Turkey. For specific location requirements and visa sponsorship details, contact fal directly through its careers page.

💰 Compensation not publicly listed. Market estimate for similar roles: from $120K, varying by experience and location.