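Janice 2023-04-29 17:27:38
Hello, regarding Question 3: the question mentions "preparation of the textual data". Why does the instructor take this to mean cleansing by default? My understanding is that preparation covers both cleaning and wrangling, so why aren't the steps that fall under wrangling part of the answer here?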
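Answers (1)
Vincent 2023-05-01 20:57:14
Hello,
The textbook calls this topic DATA PREPARATION AND WRANGLING. In that title, "preparation" refers to cleansing only and does not include wrangling.
The textbook also gives specific definitions: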
Data Preparation (Cleansing): This is the initial and most common task in data
preparation that is performed on raw data. Data cleansing is the process of examining,
identifying, and mitigating errors in raw data. Normally, the raw data are
neither sufficiently complete nor sufficiently clean to directly train the ML model.
Manually entered data can have incomplete, duplicated, erroneous, or inaccurate
values. Automated data (recorded by systems) can have similar problems due to server
failures and software bugs.
Data Wrangling (Preprocessing): This task performs transformations and critical
processing steps on the cleansed data to make the data ready for ML model training.
Raw data most commonly are not present in the appropriate format for model consumption.
After the cleansing step, data need to be processed by dealing with outliers,
extracting useful variables from existing data points,
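
To make the distinction concrete for textual data, here is a minimal Python sketch that keeps the two stages in separate functions. The assignment of steps to each stage (removing HTML tags, punctuation, numbers and extra whitespace as cleansing; lowercasing and stop-word removal as wrangling) is an illustrative assumption, and the function names, the sample sentence and the tiny stop-word list are all hypothetical rather than taken from the textbook.

```python
import re
from typing import List

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"the", "a", "is", "of"}


def cleanse_text(raw: str) -> str:
    """Preparation (cleansing): remove noise from the raw text."""
    text = re.sub(r"<[^>]+>", " ", raw)       # drop HTML tags
    text = re.sub(r"[^\w\s]", " ", text)      # drop punctuation
    text = re.sub(r"\d+", " ", text)          # drop numbers
    return re.sub(r"\s+", " ", text).strip()  # collapse extra whitespace


def wrangle_text(cleansed: str) -> List[str]:
    """Wrangling (preprocessing): transform cleansed text for modelling."""
    tokens = cleansed.lower().split()         # lowercase, then tokenize
    return [t for t in tokens if t not in STOP_WORDS]  # drop stop words


raw = "<p>The price of ABC rose 5% today!</p>"  # made-up sample input
cleansed = cleanse_text(raw)
tokens = wrangle_text(cleansed)
print(cleansed)
print(tokens)
```

Under this split, a question that asks only about preparation would be answered by the steps in cleanse_text, while the steps in wrangle_text belong to the later wrangling stage.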