Exercises

  1. Discuss the advantages and disadvantages of filling the missing Age values with a random sample that has the same mean and standard deviation as the observed ages.
  2. Re-engineer Cabin to capture social status and relationships by creating new features that reflect the relationship with survival more accurately. Discuss how well these features generalise and how specific they are.
  3. When we filled in the missing values of the Embarked attribute, we compared the price of the passenger's ticket with the prices of other tickets to infer the most likely port of embarkation. This works well, but one factor we did not consider is how the price varies with Pclass. We know that the higher the class, the more expensive the ticket. Can you analyze the ticket price by Pclass to see whether it produces results that conflict with the port allocation obtained by price comparison?
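
As a possible starting point for exercises 1 and 3, here is a minimal sketch. It assumes the data sits in a pandas DataFrame with Titanic-style columns (Pclass, Embarked, Fare, Age). The small table is made up for illustration, and the helper names `fill_age_from_normal` and `median_fare_table` are hypothetical. Drawing from a normal distribution is only one reading of "a sample that has the same mean and std"; it is not necessarily the method used earlier in the tutorial.

```python
import numpy as np
import pandas as pd

# Made-up sample with Titanic-style column names (illustrative values only).
df = pd.DataFrame({
    "Pclass":   [1, 1, 1, 2, 2, 3, 3, 3],
    "Embarked": ["C", "S", None, "S", "C", "S", "Q", "S"],
    "Fare":     [80.0, 60.0, 80.0, 20.0, 24.0, 8.0, 7.5, 8.5],
    "Age":      [38.0, np.nan, 62.0, 27.0, np.nan, 22.0, 30.0, np.nan],
})

def fill_age_from_normal(frame, seed=0):
    """Exercise 1 (assumed approach): fill missing Age with draws from a
    normal distribution whose mean and std match the observed ages."""
    rng = np.random.default_rng(seed)
    out = frame.copy()
    missing = out["Age"].isna()
    mean, std = out["Age"].mean(), out["Age"].std()
    draws = rng.normal(mean, std, size=int(missing.sum()))
    # Clip at zero so that no imputed age is negative.
    out.loc[missing, "Age"] = np.clip(draws, 0, None)
    return out

def median_fare_table(frame):
    """Exercise 3: median Fare for each (Pclass, Embarked) combination.
    Rows with a missing Embarked value are left out of the table."""
    return frame.pivot_table(
        index="Pclass", columns="Embarked", values="Fare", aggfunc="median"
    )

filled = fill_age_from_normal(df)

# Median fare per port, ignoring class (the comparison used for allocation).
fare_by_port = df.groupby("Embarked")["Fare"].median()
# Median fare per port within each class.
fare_by_class_port = median_fare_table(df)

print(fare_by_port)
print(fare_by_class_port)
```

Comparing the two printed tables on the real data shows whether the port whose typical price is closest to a passenger's ticket price changes once the comparison is restricted to that passenger's Pclass. For exercise 1, comparing the Age distribution before and after filling is a natural first check.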