Description
Strength Through Diversity
Groundbreaking science. Advancing medicine. Healing made personal.
Roles & Responsibilities:
The Data Engineer/Software Developer participates in full life-cycle application development, designing and supporting complex scientific software development and deployment for basic and clinical research studies. This individual interacts with researchers, provides technical expertise, and develops successful solutions.
This person’s primary role will be to build and extend web-based systems for data quality and site performance monitoring, and to work with the team to enable researchers to explore and analyze data from a newly funded large-scale U01 grant under NIMH’s Individually Measured Phenotypes to Advance Computational Translation in Mental Health program (IMPACTMH; https://www.nimh.nih.gov/about/director/messages/2023/making-an-impact-on-precision-medicine-in-psychiatry). Project details: https://reporter.nih.gov/search/9pb0zOfgl0WX7WJ2RDHdtA/project-details/10877461
This new initiative focuses on using behavioral measures and computational methods to define novel clinical signatures that can be used for individual-level prediction and clinical decision-making in treating mental disorders. The importance of this new NIMH program was highlighted in a recent news update issued by the White House Office of Science and Technology Policy. The study, titled “Phenotypes REimagined to Define Clinical Treatment and Outcome Research (PREDiCTOR),” will use objective, scalable, and cost-effective measurements to define these clinical signatures.
Responsibilities
- Deploy and maintain software and workflows on local high-performance computing platforms and cloud computing infrastructure (e.g., Amazon Web Services) to capture, manage, archive, and monitor multi-site, multi-modal study data.
- Applications may include, but are not limited to, study monitoring systems, data management systems, workflow execution and monitoring systems, interactive viewers, and reporting tools.
- Support data engineering efforts, including database and API design, data extraction/transformation/load (ETL), and data aggregation/integration.
- Maintain and enhance feature engineering pipelines, and design new processing pipelines, with an emphasis on version tracking, data provenance, and high-performance computing.
- Maintain the integrity and security of data in all forms of storage throughout the data architecture.
- Work effectively with other IT professionals across Mount Sinai. Comply with all applicable Institutional Review Board and HIPAA policies and procedures.
- Assist in the development of standards and procedures for data management, design, and maintenance; document all standards and procedures.
- Provide presentations and training to other team members on the above topics.
- As appropriate, write and/or contribute to scientific publications and, when needed, to grant applications.
- Maintain a flexible attitude: work with multiple technologies and languages with an open mind and without technology bias, and continuously update skills and knowledge of trends in the Big Data technology space.
- Perform other duties as required.
- Deploy and maintain pipelines for processing audiovisual (AV), smartphone (via the MindLAMP application), and Electronic Medical Record (EMR) data.
Qualifications
Education Requirements
- Bachelor’s degree in Computer Science or a related discipline; advanced degree preferred.
Experience Requirements
- 3+ years of relevant professional development experience
- Strong knowledge of database management systems (SQL and NoSQL databases)
- Experience with Amazon Web Service deployments (e.g., RDS, docDB, ECS, EC2, VPC, S3)
- Proficiency in multiple programming languages, including strong Python software engineering skills; must be flexible and quick to pick up new languages.
- Linux system administration skills (service updates, key management, security configuration, identity management), including experience with the NVIDIA/TensorFlow/Keras stack.
- Machine learning/data science skills
- Experience with big data
- Proficiency in the installation and configuration of big data software and technologies
- Familiarity with and the ability to leverage a wide variety of open-source technologies and tools (as described above).
- Strong problem-solving and analytical skills
- Excellent communication and collaboration abilities
- Adaptability and a willingness to learn new technologies and techniques
- Experience with version control (Git).