Column
|
Description
|
Changes
|
Missing or Invalid Values
|
Notes
|
Client_ID
|
Unique identifier of the client
|
None
|
N/A
|
Primary Key
|
BirthNumber
|
Birthday and gender
|
Removed
This is a 6-digit number. The documentation gives its format as:
- YYMMDD (Men)
- YYMM50+DD (Women)
Analysis suggests that the actual format is:
- YYMMDD (Men)
- YY50+MMDD (Women), i.e. 50 is added to the month
The birth date was converted to the format MM/DD/YYYY, and a new field, Client_Sex, was created.
|
N/A
|
Attribute removed from dataset and replaced by attributes Client_Sex and Client_Age
|
District_ID
|
Client's district of residence
|
None
|
N/A
|
Foreign Key
|
Client_Sex
|
Gender derived from BirthNumber attribute
|
Added
Built during data pre-processing.
Values are MALE (M) and FEMALE (F)
|
N/A
|
Distribution of Values:
- 49% Male
- 51% Female
|
Client_Age
|
Discretized value derived from the BirthNumber attribute
Values:
- 1 = YOUTH (0 - 24)
- 2 = ADULT (25 - 35)
- 3 = MIDDLE-AGE (36 - 64)
- 4 = SENIOR (65 and over)
|
Added
Built during data pre-processing.
|
N/A
|
None
|
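The derivation of Client_Sex and Client_Age from BirthNumber could look roughly like the sketch below. It is not the actual pre-processing code. It assumes the format suggested by the analysis (50 added to the month for women), that all birth years fall in the 1900s, and that age is measured against a reference date passed in by the caller; the source does not state which reference date was used. The age bins are taken as 0-24, 25-35, 36-64, and 65 and over, so treating age 24 as YOUTH is also an assumption. The function names `decode_birth_number` and `age_group` are hypothetical.

```python
from datetime import date


def decode_birth_number(birth_number: str, century: int = 1900):
    """Split a 6-digit birth number into (birth_date, sex).

    Assumes YYMMDD for men and YY(MM+50)DD for women, with all
    years in the given century (an assumption, not from the source).
    """
    if len(birth_number) != 6 or not birth_number.isdigit():
        raise ValueError(f"expected 6 digits, got {birth_number!r}")
    yy = int(birth_number[0:2])
    mm = int(birth_number[2:4])
    dd = int(birth_number[4:6])
    if mm > 50:
        # Month offset by 50 marks a female client.
        sex = "F"
        mm -= 50
    else:
        sex = "M"
    # date() raises ValueError for impossible months or days.
    return date(century + yy, mm, dd), sex


def age_group(birth_date: date, as_of: date) -> int:
    """Map age in whole years at `as_of` to the codes 1-4."""
    had_birthday = (as_of.month, as_of.day) >= (birth_date.month, birth_date.day)
    age = as_of.year - birth_date.year - (0 if had_birthday else 1)
    if age <= 24:
        return 1  # YOUTH
    if age <= 35:
        return 2  # ADULT
    if age <= 64:
        return 3  # MIDDLE-AGE
    return 4      # SENIOR


# Illustrative value and reference date (both hypothetical).
birth_date, sex = decode_birth_number("706213")
print(birth_date.strftime("%m/%d/%Y"), sex, age_group(birth_date, date(1999, 1, 1)))
# prints: 12/13/1970 F 2
```

Keeping the decoding and the binning in separate functions means the reference date and the bin boundaries can be changed without touching the parsing logic.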