Full time Remote / Telecommute

Senior Software Engineer, Archiving & Data Services

Created: August 23, 2021


Interested in a mission-driven job ensuring perpetual open access to information for a global audience? Enjoy helping scale the use of services and products critical to hundreds of national and international non-profits, libraries, universities, cultural heritage institutions, and mission-aligned organizations? If so, the Internet Archive is seeking a Senior Software Engineer for its Archiving & Data Services team. The Internet Archive (IA) is a non-profit digital library, top 200 website at archive.org, and an archive of over 80PB of digital information running in multiple self-owned data centers. The Internet Archive also partners with organizations worldwide to advance the shared goal of “Universal Access to All Knowledge.” The Archiving & Data Services group provides a suite of paid and free products focused on the archiving, management, analysis, and accessibility of digital information. Its web archiving, digital preservation, and web and data services are used by over 800 organizations around the world.

The Archiving & Data Services team is seeking a Senior Software Engineer to advance and expand our suite of web archiving, digital preservation, and data services. The ideal candidate has developed new products, systems, and applications, has maintained a code base for multiple years, and has experience working in the domains of archiving, data processing, or data engineering, or has deep experience with web protocols and technologies. The role will contribute to a variety of projects, often in a technical leadership capacity, including feature development on mature SaaS products, new application development, helping build core technologies for new products, API and microservices development, systems architecture, managing clustered deployments of harvesting and processing tools, and assisting with ongoing operational improvements. Our services enable users to archive web-published material at scale and to completeness, process terabytes and petabytes of archived data, facilitate the discovery and use of these archived collections, and curate and manage a diverse set of born-digital material. The Senior Software Engineer will have the unique opportunity to build things that enable non-profit cultural heritage organizations around the world to build collections for future research, scholarship, and memory.

Essential Job Duties:

  • Contribute to, and often lead, complex engineering projects developing new products, new features for mature, widely-used platforms, and critical infrastructure for collecting, processing, and making accessible petabyte-scale amounts of data.

  • Deliver on commitments with deadlines and project timelines while working in a collaborative, distributed team of junior and senior engineers

  • Measure production system performance and benchmark new tools

  • Participate in, and help propagate, engineering team procedures and ceremonies including testing, code review, documentation, retrospectives, team RFCs, et cetera

  • Lead large scale deployments and releases

  • Work closely with product, support, and other non-technical teams to translate requirements and features into technical designs -- strong communication and collaboration skills are a must.

  • Configure, maintain, and improve tool stacks related to web crawling, data processing, access systems, open APIs, and databases.

Qualifications and Skills:

  • Significant experience building web-facing applications

  • Advanced Python on the server, or other language mastery with a willingness to go all-in on Python. Django experience is a plus.

  • Experience with JavaScript, Angular, or Web Components is a plus

  • Experience with development environments and system monitoring/administration tools and experience with open source practices, version control, and code review

  • Experience with Postgres, Elasticsearch, and Hadoop/HDFS, or strong experience in one of these technologies, is preferred

  • Ansible, GitLab, GitHub, Sentry, Grafana, and JIRA are other tools we use.

  • Our independently operated data centers run Ubuntu Linux VMs and our department runs everything from the VM up, so deep Linux experience is a plus

  • We are open to candidates with either a frontend or backend emphasis

Job Details:

This is a remote-first position working in a distributed team. Candidates will need to have some time overlap with a distributed team based primarily in North America (mostly Pacific Time) for collaborative work and meetings. The role reports to the Senior Engineering Manager, Archiving & Data Services.


Benefits & Perks:

The Internet Archive is a remote-first workplace and provides a comprehensive benefits package including PTO, paid holidays, and medical benefits. Depending on where you live, we also provide these additional benefits: dental, vision, health savings accounts, flex spending accounts, commuter benefits, short-term disability, long-term disability, and retirement programs.

At the Internet Archive, we believe we do our best work when our employees bring together diverse ideas. Members of all groups underrepresented in the tech industry and library world are strongly encouraged to apply. We are proud to be an equal opportunity workplace and are committed to equal employment opportunity regardless of race, color, religion, national origin, age, sex, marital status, ancestry, physical or mental disability, genetic information, veteran status, gender identity or expression, sexual orientation, or any other characteristic protected by applicable federal, state, or local law.

Last updated: Monday, August 23, 2021 19:35 UTC