Full time Remote / Telecommute

Data Engineer

Durham, NC
Created: September 29, 2021

Description

The Duke Law Library’s Empirical Research and Data Support Services (ERDSS) unit seeks applications for the position of Data Engineer. ERDSS serves the dynamic empirical research needs of Law School faculty, students, and staff by cultivating a range of skills and talents that effectively support research across the entire lifecycle of an empirical project. In response to growing interest among Law School stakeholders in applied data science experience, the Data Engineer will enhance ERDSS capacity to create, maintain, and promote research-ready data deliverables for faculty, staff, and students.

Under the supervision of the Associate Director for ERDSS and in concert with other ERDSS team members, the Data Engineer will develop and execute data preparation workflows that accurately and efficiently address the data needs of project stakeholders. The Data Engineer will also support ERDSS in efforts to promote overall data literacy in legal scholarship and practice at Duke Law and engage regularly with like-minded individuals across the broader Duke University community. 

Responsibilities
•    Coordinate and monitor the design, development, modification, and implementation of information technology applications to combine, manipulate, and transform datasets of varying size, scope, and substance into a format that allows researchers to more easily view, interpret, and analyze data.
•    Extract, clean, and organize raw information harvested from different electronic and web-based sources, including social media platforms, websites, and PDF documents.
•    Develop API calls and other programming functions to automate the collection and management of frequently used research data sources.
•    Supervise and direct the performance of one or more student research assistants.
•    Participate in development of long-range planning for and prioritization of new and existing projects.
•    Ensure compliance with external and internal regulations and policies governing data management, including those concerning security, auditability, and privacy.
•    Advise students, faculty, and staff on the use of data management concepts and tools, inside or outside a classroom setting.
•    Participate in a lively community of data support professionals at Duke University and the broader Research Triangle area.
•    Perform other duties as assigned.

Required Qualifications
Education/Training

Work requires a Bachelor's degree in mathematics, computer science, or a computer-related field.

Experience
Work requires 4 years of progressive programming or database administration experience, including the design, implementation, tuning, backup, recovery, modification, and reorganization of relational databases for a complex computer network.
OR AN EQUIVALENT COMBINATION OF RELEVANT EDUCATION AND/OR EXPERIENCE

Desired Qualifications
•    Master’s degree in computer/data science or an applied social science field
•    Strong technical skills and problem-solving abilities related to research data management
•    Demonstrated experience with web scraping and electronic document parsing
•    Proficiency in Python and/or R programming languages
•    Working knowledge of HTML, JavaScript, and web design to facilitate entity extraction from web-based sources
•    Familiarity with distributed version control using Git
•    Demonstrated interpersonal and teamwork skills complemented by the ability to take initiative
•    Excellent oral and written communication skills
•    Experience or familiarity with legal information
•    Demonstrated experience collaborating with academic stakeholders on applied data projects
•    Applied experience with natural language processing and machine-learning algorithms
•    Proficiency with Linux-based operating systems
•    Understanding of metadata standards in a research data context
•    Demonstrated experience using virtual machine environments to manipulate datasets
•    Familiarity with the production and deployment of interactive data applications/dashboards

Metadata

Published: Wednesday, September 29, 2021 19:01 UTC


Last updated: Wednesday, September 29, 2021 19:01 UTC