So recently I’ve changed jobs. About three months ago, I started as a QA in a Data Analytics team, working for an e-commerce company.
This change turned out to be pretty rough — since I’ve been testing mostly UI and APIs for the last couple of years — and Analytics is a different creature. Like most people, sometimes I love it, and sometimes I hate it (and usually I’m at somewhere in between). Here is my experience:
First off: How to get a job as a QA for a Data Analytics team?
I am not an expert in Database testing, but I had been using some SQL Server, MySQL and MongoDB in the past. Having experience working with different databases is always a plus, but not exclusive, and usually most Testers have to do some type of data validation at some point in any project.
My take from the interview I had was:
- Be ready to speak about the different validation experiences you have with data.
- Remember a few scenarios you tested, and walk your interviewer through them. This way you can show them you know what you are talking about, and even if these cases don’t apply 100% to Analytics, it’s always a good exercise to show your soft skills, like clear communication, understanding the tasks you work with and deep knowledge of the type of testing you are discussing.
- Do some research about the most common tools in Analytics so you already have an idea of the work in case there is some activity or question of any potential testing you might have to perform. Tools like Sigma, QlikView and PowerBI are some of the most used.
First week in Data Analytics
If you are like me and everything is new (and exciting), the first one or two weeks will be to get to know the company, the product, and the work methodology. In my case, I had an onboarding process in which the first part consisted of the initial configurations for all the tools and credentials of the company, and some hands-on training to start practicing.
I also had 1:1 with people on my team, I had Devs helping me with installing some of the software, and a lot of meetings with the BA of the team, who would explain the reports and the sources of the data.
Basically, their job is to feed some Sigma reports with data from different teams in the company, who already have their reports in excel files and our job is to migrate those into a much nicer and aesthetic automated report. Reports can vary on the theme but the metrics and graphics are usually repeated across several workbooks. Sales, Shipments, Returns, Estimates… all generic names that can be applied to most e-commerce companies. These reports are usually audited since their data is used for decision-making. This is why it’s important to make sure that we are reporting the correct one.
What do I test?
- Table validations: Tables are usually populated from Software used by the employees. Can be an ERP, an e-commerce app, a logistic software made for a Distribution Center, etc.
- View validations: Similar to tables, views are populated by different sources. Tables are combined to allocate and present specific data, that can be used to populate a report with the information already filtered out.
- Datasets: Datasets are previews of data already organized in the Reporting tool. The data is prepared to be displayed in the Workbook and there is a custom SQL that feeds it. It can consume different views and tables and have a logic of its own.
- Workbooks: The actual report. This is the final product and what would also be validated by the Stakeholders. It is usually divided into different tabs and data is presented in metrics, graphs, and charts. Functions and calculations can be specific for a metric and the formulas are present within the Workbook to validate.
What type of validations do I do?
There are standard validations that apply to all of the above examples. Some of them are:
- Spotchecks against the UI: Take one row returned from the query and validate the data stored in the different columns against the actual source.
e.g: The unit_price column shows $20.50 for Sales Order #12345. Let’s open the app and check the field. Is it 20.50? If that’s so, repeat the same process a couple of times until you can make sure that a fair amount of rows are showing the correct information.
- Column names and Data Types: Check the column names against the requirements. Do they match? Is it the same unit_return as return_unit? And how about the data types? Use the ticket as a reference or ask your BA to complete the information for you to validate. It’s not the same for a column to store an integer as it is to store a varchar.
- Formulas and calculations: This is my least favorite part, but you may like it. It’s almost certain that while working on reports for financial or sales teams, you would have some amount of metrics and formulas to validate. First, validate that the formula is correct. Second, validate that it is doing what it is supposed to be doing. Do a manual example, look at the fields, and try to do it on your own. Do you get the same results?
- Row counts: One of the best ways to make sure that we are including all the data from the table or the view in the Workbook is to do a row count and compare both. Are we missing rows? Is there any logic in that Tab or section that can be excluding rows? Maybe it’s grouping by any filter? Do we have more rows? Let’s check both and see which ones are missing. Talk to the developer to understand what could be causing the issue.
- Typos and UI: Although there is not much UI to work with, there is always a level of customization made to these reports. Can be a logo, a date added, a text, an “About” page, or even the names of the columns and the column comments. Typos and simple errors are made more often than not, and if missed can be perceived as unprofessional.
- Filters: Last but not least. Filters are key in Data Analytics. It’s important to make sure that they are working, and not only that but also showing the data based on the filter selected. Can we do a multiple selection? Can we select several filters at once?
Conclusion and thoughts:
This is a learning process for me. After working for seven years in Manual and Automated testing, this is my first time working exclusively with data. I’m sure there are many more validations, tools, and ways to work with it, and possibly many more ways to improve these processes (and especially to get better at formula validations) but to conclude this text I would like to say:
- Excel never gets old.
- The more SQL you know, the better and easier the job will get
- Don’t be afraid to work on something you know very little (or nothing) about.