List: KAGGLE/Intro to Machine Learning (4)
yoooniverse
In [1]: ls
sample_data/ train.csv

A quick look at data

In [26]:
import pandas as pd
iowa_file_path = 'train.csv'
home_data = pd.read_csv(iowa_file_path)
home_data.describe()

Out[26]: Id MSSubClass LotFrontage LotArea OverallQual OverallCond YearBuilt YearRemodAdd MasVnrArea BsmtFinSF1 ... WoodDeckSF OpenPorchSF EnclosedPorch 3SsnPorch ScreenPorch PoolArea MiscVal MoSold YrSold SalePrice ..
Underfitting: when a model fails to capture important distinctions and patterns in the data, so it performs poorly even on the training data. Failing to capture relevant patterns again leads to less accurate predictions. Overfitting: when a model matches the training data almost perfectly but does poorly on validation and other new data. Since we care about the accurac..
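The two failure modes above can be seen in a small experiment. The following is a minimal sketch using only the Python standard library: the data is synthetic, and the piecewise-constant model (`fit_groups`, `predict`) is a hypothetical stand-in invented for this illustration, not the model used in the Kaggle course. The number of groups plays the role of model capacity, and error is measured as the average absolute difference between actual and predicted values.

```python
import random

def mae(actual, predicted):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def fit_groups(xs, ys, n_groups):
    """Sort points by x, cut them into n_groups contiguous groups,
    and store each group's upper x bound and mean y."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    groups = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        mean_y = sum(y for _, y in chunk) / len(chunk)
        groups.append((chunk[-1][0], mean_y))
    return groups

def predict(groups, x):
    """Return the mean y of the first group whose upper bound covers x."""
    for upper_x, mean_y in groups:
        if x <= upper_x:
            return mean_y
    return groups[-1][1]  # x lies beyond the last training point

def make_data(rng, n):
    """Synthetic data (an assumption): a curved trend plus Gaussian noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [(x - 5) ** 2 + rng.gauss(0, 2) for x in xs]
    return xs, ys

rng = random.Random(0)
train_x, train_y = make_data(rng, 40)
val_x, val_y = make_data(rng, 40)

results = {}
for n_groups in (1, 5, 40):
    model = fit_groups(train_x, train_y, n_groups)
    train_mae = mae(train_y, [predict(model, x) for x in train_x])
    val_mae = mae(val_y, [predict(model, x) for x in val_x])
    results[n_groups] = (train_mae, val_mae)
    print(f"groups={n_groups:2d}  train MAE={train_mae:.2f}  validation MAE={val_mae:.2f}")
```

With 1 group the model predicts the overall mean everywhere, so it misses the curve even on the training data (underfitting). With 40 groups every training point gets its own group, so the training error is exactly zero while the validation error is not (overfitting).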
The relevant measure of model quality: predictive accuracy (that is, will the model's predictions be close to what actually happens?) (1) First, we need to summarize model quality in an understandable way, as a single metric. The metric used here: Mean Absolute Error (also called MAE). How MAE works: the prediction error for each house: error = actual − predi..
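As a minimal sketch of the metric, assuming the standard definition of MAE (take the absolute value of each per-house error, then average): the function below is written by hand for illustration, and the prices are made-up numbers, not values from the dataset. scikit-learn provides an equivalent as `sklearn.metrics.mean_absolute_error`.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute per-item errors |actual - predicted|."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    errors = [a - p for a, p in zip(actual, predicted)]
    return sum(abs(e) for e in errors) / len(errors)

# Hypothetical prices in thousands of dollars (made up for this example).
actual_prices = [200, 150, 300]
predicted_prices = [180, 160, 300]

# Errors are 20, -10, 0, so the absolute errors are 20, 10, 0.
print(mean_absolute_error(actual_prices, predicted_prices))  # → 10.0
```

Taking the absolute value keeps positive and negative errors from cancelling each other out, so the result reads as "on average, predictions are off by about this much".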
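Notes on the Learn Tutorial: Intro to Machine Learning. Learn Intro to Machine Learning Tutorials: Learn the core ideas in machine learning, and build your first models. www.kaggle.com Scenario: a cousin has made a lot of money investing in real estate. Knowing that I am interested in data science, the cousin proposes that I become a business partner: the cousin supplies the money, and I supply the model. What kind of model? A model that predicts the value of various houses. How did the cousin make money investing in real estate? By identifying price patterns in houses seen in the past and applying those patterns to make predictions about the prices of new houses. ..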