Machine Learning-Big Data Engineer in New York, NY at Open Systems Technologies

Date Posted: 10/19/2019

Job Description

A global investment firm is seeking a Machine Learning/Big Data Engineer to join their team in New York, NY.


  • Create and maintain optimal data pipeline architecture
  • Assemble large, complex data sets that meet functional/non-functional business requirements
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Build infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS big data technologies
  • Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics
  • Work with stakeholders from both the business and technology sides to assist with data-related technical issues and support their data infrastructure needs
  • Create data and analytical tools for internal customers that assist them in building and optimizing our data organization into an innovative industry leader
  • Work with data and analytics experts to strive for greater functionality in our data systems


  • Must have at least a Bachelor’s degree in Computer Science or related field
  • 3+ years of experience with a Master’s degree, or 5+ years with a Bachelor’s degree, in Computer Science, Statistics, Informatics, Information Systems or another quantitative field
  • Experience with object-oriented/functional languages: Python, Java, Scala
  • Experience with big data tools: Hadoop, Spark, Presto, Kafka
  • Experience with relational SQL and NoSQL databases, including SQL Server, MySQL and Cassandra
  • Experience with data pipeline and workflow management tools: Airflow, Luigi
  • Experience with AWS cloud services: EC2, EMR, RDS, Redshift, Athena, SQS
  • Knowledge of ML (e.g., logistic regression, random forest, XGBoost, neural networks, etc.) and experience with ML tools (e.g., scikit-learn, Spark ML, TensorFlow, Keras, etc.) is preferred
  • Savvy with data science stack (Pandas, NumPy, SciPy)
  • Data Science/Analysis background; Proficient at working with large datasets
  • Unix/Linux command-line experience
  • Experience working with AWS, GCP or Azure
  • Broad understanding of equities, derivatives, futures, FX, or other financial-services instruments
  • Excellent listening and communication (both oral and written) skills
  • Self-starter and critical thinker who takes ownership of projects and suggests improvements to the entire infrastructure
  • Able to work independently and in a collaborative environment
  • Able to handle several projects with different priorities at the same time in a fast-paced environment
  • Excellent self-management and problem-solving skills
  • Knowledge of Agile/Scrum methodologies preferred