1. Understand data from Internet finance, e-commerce, telecom operators and other industries, and use big data mining techniques to develop high-quality data products related to users or businesses, ensuring the quality and timeliness of the data.
2. Have a thorough understanding of the full pipeline of data acquisition, data fusion, data quality and data application, and be able to use innovative methods to solve practical problems along that pipeline.
3. Be familiar with the management and application of data assets, integrate data from business, product and other channels, and collaborate with the technology and data warehouse teams to improve and manage word-of-mouth data assets.
4. Analyze and monitor the processes and stages of data asset management and data quality management, find potential loopholes and problems in time, and resolve them quickly.
5. Guarantee the robust and reliable operation of data applications.

Job requirements:
1. Bachelor's degree or above, majoring in statistics, mathematics, computer science, economics, etc.
2. Three or more years of experience in data analysis or data quality assurance, with a strong sense of data standards and data sensitivity; model development experience is preferred.
3. Experience in application testing and quality assurance; experience with a programming language for automation is preferred.
4. Good communication and coordination skills and awareness of the big picture; able to drive work through to delivery quickly.
5. Experience in real-time data processing or risk control platform testing is preferred.


Job responsibilities:
1. Responsible for testing data processing in the big data department, including functional testing of processing logic, white-box testing, automated testing, etc.
2. Independently design and execute test cases, track defects, develop test plans, and collaborate with development on testing activities at all stages;
3. Develop test tools or automation solutions to improve test efficiency;
4. Control, identify and prevent test risks and improve the project's test process;
5. Analyze and locate problems found in testing, communicate actively and effectively with developers and requesters, and push problems through to resolution.

Job requirements:
1. Bachelor's degree or above, with more than 3 years of experience in software development or automated testing framework development;
2. Familiarity with Hadoop, Hive and other products in the big data ecosystem is preferred.
3. Good ability to work under pressure and to solve problems independently; testing experience with data warehouses, BI, or big data solutions is preferred;
4. Good language skills, good organizational and collaborative abilities;
5. Proficient in Linux; familiar with at least one of Shell, Python or Java; familiarity with test frameworks such as Selenium is preferred;
6. Familiarity with Jenkins, Maven and other continuous build tools is preferred.
7. Familiar with database theory and skilled in SQL.

Look at another posting: there is no testing in it at all, it is simply a data engineer role.
Data Engineer
Required proficiency: Hive, ETL
Work experience: 3 years or more
Skilled in: programming with the APIs provided by MapReduce and Spark, with experience in massive data processing (ETL);
Familiar with: the Hadoop ecosystem (Hadoop, Spark, Storm, HBase, etc.), with in-depth understanding of at least one of these projects
Familiar with shell commands and simple shell programming; familiar with Linux text-processing commands such as vi, awk and sed
Candidates who are skilled with Hadoop or other distributed platforms and can write MapReduce jobs in Java, Python or other languages for big data processing are preferred;

Our company has a dedicated data team that is responsible for generating Tableau reports, which look very flashy.
Member points calculation, tier calculation and the like involve business logic, so they are developed, tested, and then released to production;
For some customers, data exchange between multiple parties goes through ETL (CloverETL) to a designated SFTP server;

Development and testing are usually done on the Dev and QA databases.
Some member lists have to be run against production. For example, members whose birthday is next month are given a cake coupon if they have a transaction within the year defined by the business logic. In addition, BI staff pull the data and insert it into the business tables of the production DB.
BI has no testers.
And so mistakes happen often…
So the testers were asked to help test BI, and the tester does everything: reviewing the code, writing automated test scripts, and testing the data… And there is only one tester per project, and one tester is responsible for several projects at the same time.
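
For a rule like the birthday cake coupon, one thing a tester can do is write a small independent check and compare its result with the list that BI pulls. The sketch below is only an illustration: the `Member` structure and the function names are made up, and it assumes that "the year" means the calendar year of the run date, which would need to be confirmed against the real business rule.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Member:
    # Hypothetical shape of a member record, not the real table layout.
    member_id: str
    birthday: date
    transaction_dates: List[date] = field(default_factory=list)


def next_month(run_date: date) -> int:
    """Return the month number following run_date (December wraps to January)."""
    return run_date.month % 12 + 1


def eligible_for_cake_coupon(member: Member, run_date: date) -> bool:
    """Birthday falls in next month AND at least one transaction this year."""
    has_birthday_next_month = member.birthday.month == next_month(run_date)
    # Assumption: the qualifying year is the calendar year of run_date.
    has_transaction = any(
        d.year == run_date.year for d in member.transaction_dates
    )
    return has_birthday_next_month and has_transaction
```

Running it for a date in December is a useful case, because the "next month" wraps around to January, which is exactly the kind of boundary that a hand-written query can get wrong.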


Then let's look at the stored procedures behind the data.

The requirement is as follows: Please remove Mr/Ms/Ms/Customer/COACH/Customer in customer's name in trigger. And these customers won't receive our DM pack
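
To test a requirement like this, it helps to have an expected result that is computed independently of the stored procedure. Below is a minimal sketch of such a check. The title list is copied from the requirement text; everything else is an assumption, including that titles appear as whole words at the start or end of the name, that matching ignores case, and the function name `strip_titles` itself.

```python
import re

# Titles taken from the requirement text; the real production list may differ.
TITLES = ("Mr", "Ms", "Customer", "COACH")

# A title counts only as a whole word at the start or the end of the name,
# optionally followed by a dot (assumption, not confirmed by the requirement).
_TITLE_RE = re.compile(
    r"^(?:%(t)s)\.?\s+|\s+(?:%(t)s)\.?$" % {"t": "|".join(map(re.escape, TITLES))},
    re.IGNORECASE,
)


def strip_titles(name: str) -> str:
    """Remove leading and trailing titles until none remain."""
    cleaned = name.strip()
    while True:
        stripped = _TITLE_RE.sub("", cleaned).strip()
        if stripped == cleaned:
            return cleaned
        cleaned = stripped
```

A name that consists of nothing but a title is left unchanged by this sketch; what should happen to such a record is a question for the business side, and it is a good test case to raise with them.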


The code as a whole is a mess: the data engineers create temporary physical tables at will and then reference them across stored procedures.

The #temp temporary tables are named arbitrarily: a, B and 111 all show up.

There is no documentation, and when you ask the authors themselves, they are confused too.

It is also long-winded. They do not pick the simplest way to implement something; they like to write thousands of lines at great length.



Then new requirements come in. Because we are a long-term marketing department, not a product department, whenever a customer has a pain point we ship a new patch or something similar within a day or two.

As a result, the whole stored procedure becomes longer and longer, and more and more tables are involved.

In the end, no one can describe in simple terms what the stored procedure is doing.


What should I do? Let’s test.


