Data Engineer

Location: South Yorkshire
Job Type: Full Time

Description

Data Engineer

GCB5

Some careers grow faster than others.

If you’re looking for a career that will give you plenty of opportunities to develop, join HSBC and your future will be rich with potential. Whether you want a career that could take you to the top, or simply take you in an exciting new direction, HSBC offers opportunities, support and rewards that will take you further.

The Compliance Global Function within HSBC is responsible for implementing effective global standards to combat financial crime, including: Anti-Money Laundering (AML), Sanctions and Anti-Bribery & Corruption (AB&C) compliance.

The Financial Crime Threat Mitigation (FCTM) department within Compliance includes investigators, intelligence analysts, data scientists, subject matter experts, liaison specialists and innovation specialists. FCTM is responsible for identifying, analysing & investigating financial crime risk, and ensuring that these risks are properly mitigated within HSBC. The current FCTM systems architecture is based on a large on-premise Hadoop platform (aka Compliance Data Lake), which supports a large number of data processing and analytics workloads.

HSBC IT has an ambitious strategy to move all on-premise analytics workloads onto Google Cloud (GCP). At the same time, Compliance has started a large business programme (ILFCRM or Intelligence Lead Financial Crime Risk Management) to radically improve HSBC's financial crime risk systems & processes.

One of the objectives of ILFCRM is to build a new GCP application (called DRA or Dynamic Risk Assessment) that uses advanced machine learning techniques. DRA will provide (a) data scientists with a secure environment for developing & training new ML models, and (b) an execution environment for running approved ML models in a controlled operational environment.

This role is for a Data Engineer in the DRA IT team. The successful candidate will help us develop and build out our new DRA application in GCP.

Key duties include:

  • Help to deliver GCP development projects (including design, development/coding, testing & deployment into Production), for example:
      • Example 1: Build an event-driven pipeline: a message is published by an HSBC system > Pub/Sub > Dataflow > model prediction > save results in BigQuery.
      • Example 2: Build a batch process: a large file is generated in an HSBC Hadoop cluster > a batch job transfers the file to Google Cloud Storage > data is copied to BigQuery.
      • Example 3: Write a complex Dataflow/Beam job in Java or Python to merge customer & reference data > update the customer profile in Bigtable > re-calculate the customer risk score.
      • Example 4: Build a secure analytics environment in GCP for Data Scientists (they want to use tools such as Python, R, Jupyter notebooks, ML libraries and BigQuery).
  • Collaborate with central teams (architecture, security, engineering, networks) who are responsible for delivering HSBC’s baseline cloud architecture.
  • Collaborate with other development teams that are also working to deliver use cases on GCP (share best practice).