Meeting with Trâm Vo, Data Wrangler

Published on 14/04/2021

Data & AI Stories, meet the expert: what is a Data Wrangler?

 
Data and artificial intelligence (AI) are the cornerstones of our digital transformation strategy and support our three strategic priorities: client centricity, operational efficiency and responsibility. Today, with a portfolio of more than 250 Data and AI use cases in production, we are confirming our ability to transform ourselves.

The challenge for Societe Generale is now to accelerate this transformation, in particular by relying on multi-disciplinary teams of experts, capable of rapidly building effective and value-creating Data & AI solutions.

These are the experts we have chosen to introduce to you in order to understand their jobs, discover their career paths and how they are participating in the digital transformation underway within our Group. Following Marion Cabrol, Data Scientist at the Investment Banking Digital Office, we met with Trâm Vo, Data Wrangler on the Model and Data Science team of the Risk Department.

Trâm, how would you explain your job “for dummies”?

The Wrangler is a cowboy, the one in charge of gathering his herd. The Data Wrangler is therefore the expert who gathers heterogeneous data to transform it and arrange it in standardised databases for specific business uses: reporting with large volumes of heterogeneous data, data mapping to meet business needs, reference data sets for modelling and research on big data.

How did you become a Data Wrangler?

Before becoming a Data Wrangler, I was a credit analyst in the back office of a retail bank in France and then internationally. I was in direct contact with clients, a position I held for 10 years, which enabled me to acquire banking knowledge that went far beyond granting credit.

Because I was handling a lot of data, I realised that the means at our disposal needed to be improved to make our decision-making in granting credit more fluid. This observation made me want to understand the data cycle from its source, its supply, but also to contribute to its better use.

In 2016, I joined the cross-functional risk management team in the Risk Department as a “Data Preparer”. There, I perfected my knowledge of banking and data manipulation in a data-oriented team, mainly to the benefit of financial communication: we were already doing data wrangling without even knowing it!

Five years later, the role is recognised, and I am now a Data Wrangler in the first ever Data Wrangling team created in the Risk Department.

What is the role of the Data Wrangler and what are your main activities?

In the context of a data or AI project, the Data Wrangler is responsible for the “Data Preparation”. Preparing the data requires several steps.

First, I search for and retrieve the data useful to the project from the internal systems: this step is called “Data Collection”. I then check the completeness and consistency of the data collected: this is the “Data Quality Check” stage. Next comes the “Data Transformation” phase, during which I structure the data and apply filters as required. The last step is to guarantee the traceability of the data used and transformed, and for this purpose I draw up clear “Data Audit Trail” documentation.
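The four steps described above can be sketched in a few lines of Python. This is only a minimal illustration, not the team's actual tooling: the source systems, field names (client_id, exposure, segment), sample records and function names are all hypothetical, and the audit trail is reduced to a list of timestamped row counts.

```python
from datetime import datetime, timezone

# Hypothetical raw records from two internal systems (illustrative data only).
SYSTEM_A = [
    {"client_id": "C001", "exposure": "1200.50", "segment": "retail"},
    {"client_id": "C002", "exposure": "", "segment": "retail"},
]
SYSTEM_B = [
    {"client_id": "C003", "exposure": "980.00", "segment": "corporate"},
]


def collect(*sources):
    """Data collection: gather records from every source into one list."""
    return [dict(row) for source in sources for row in source]


def quality_check(rows, required=("client_id", "exposure")):
    """Data quality check: split rows into complete and incomplete ones."""
    valid, rejected = [], []
    for row in rows:
        complete = all(row.get(field) not in (None, "") for field in required)
        (valid if complete else rejected).append(row)
    return valid, rejected


def transform(rows, segment):
    """Data transformation: convert exposure to a number, filter on a segment."""
    return [
        {**row, "exposure": float(row["exposure"])}
        for row in rows
        if row["segment"] == segment
    ]


def prepare(segment):
    """Run the steps in order and record each one in an audit trail."""
    audit_trail = []

    def log(step, count):
        # Data audit trail: keep the step name, row count and timestamp.
        audit_trail.append({
            "step": step,
            "rows": count,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    rows = collect(SYSTEM_A, SYSTEM_B)
    log("collection", len(rows))
    valid, rejected = quality_check(rows)
    log("quality_check", len(valid))
    prepared = transform(valid, segment)
    log("transformation", len(prepared))
    return prepared, rejected, audit_trail
```

Calling prepare("retail") returns the prepared rows, the rows rejected at the quality check, and the audit trail, so a reviewer can see how many records survived each step.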

If changes occur during the course of the project, the Data Wrangler has to adapt his or her work in an agile way, together with the project stakeholders: we offer “tailor-made” solutions.

Which other business, IT and data divisions do you collaborate with in the Group?

We collaborate with all divisions because each one can be a source of data for the projects, especially when this data is not yet available centrally: it is then the division that allows access to the data.

When data is centralised and made available, we work with IT to extract the data we need.

Another team we cooperate with, and that we will soon be joining, is the Risk Department's Digital Transformation Office: we are working together on data recovery projects in a more agile way.

Can you share with us a project on which the collaboration of a Data Wrangler was decisive?

The Haussmann programme is a major project at Group level, launched in September 2019 at the request of the regulator. The objective of this programme is to simplify the structure of our internal ratings-based (IRB) models. It is coordinated by the Risk Model Management Team.

The Data Wrangler plays a key role from the start of the project. We are at the heart of the team and take part in strategic discussions to jointly determine a coherent and realistic roadmap with the other stakeholders: providing feedback on technical constraints, implementing business management rules, organising clustered data, etc.

Our involvement is essential for the smooth running of the subsequent data preparation, and each iteration adds quality and helps the project progress.

Do all Data & AI use cases involve a Data Wrangler?

Yes, necessarily. Wrangling is a preliminary and essential step. It is carried out more or less consciously by the data users. And the larger the datasets, the more important the Wrangling function.

How do you see this profession evolving within the Group?

At present, the digital transformation underway within the Group is focusing on the divisions related to the handling of data. Our team is growing strongly: recruitment is continuing in 2021, and local relays are being set up in certain departments through mentoring to better absorb demand, for example with the non-retail credit portfolio analysis team.

Data Wrangling is an expertise that can be found everywhere and whose practices are applicable to all divisions as long as the data can be accessed. In the long term, the idea is of course to professionalise each team so that data wrangling can spread to all levels.

What skills do you think are required to become a Data Wrangler?

There is no typical profile for becoming a good Data Wrangler. On the contrary, multidisciplinarity is a major asset within the team.

In my opinion, five qualities are assets for becoming a Data Wrangler:

  1. Cultivate a collaborative spirit: you never work alone, but always in a team.
  2. Use your logical mind: be able to reason logically, and be methodical and agile.
  3. Have a solid knowledge of banking: this is a major asset, because in order to organise the data properly, you need to understand it well.
  4. Be curious: whether your profile is more banking-oriented or more “Data Oriented”, curiosity lets you balance it by soaking up the multidisciplinarity of the team. Everyone benefits from complementary skills, and this pulls the team up.
  5. Have coding knowledge: we now have powerful new Data Preparation platforms such as Alteryx and Dataiku. These tools revolutionise our work and free us a little from the technicalities of programming languages, but knowing how to code remains a real asset for manipulating data.

In the end, Wrangling is above all about knowing how to question the quality of the metrics used, and how to translate the customer's needs.