Machine Learning With Wellness

This analysis focuses on Wellness Profiles in Scotland (the data is [here], and an analysis of the data is [here]). The dataset includes the following indicators:

Male life expectancy
Female life expectancy
Deaths all ages
All mortality among 15-44 year olds
Early deaths from CHD (<75)
Early deaths from cancer (<75)
Estimated smoking attributable deaths
Smoking prevalence (adults 16+)
Alcohol-related hospital stays
Deaths from alcohol conditions
Drug-related hospital stays
Active travel to work
New cancer registrations
Patients hospitalised with (COPD)
Patients hospitalised with coronary heart disease
Patients hospitalised with asthma
Patients with emergency hospitalisations
Patients (65+) with multiple emergency hospitalisations
Road traffic accident casualties
Population prescribed drugs for anxiety/depression/psychosis
Patients with a psychiatric hospitalisation
Deaths from suicide
Adults incapacity benefit/severe disability allow/employment allow
People aged 65+ with high care needs cared at home
Children looked after by local authority
Single adult dwellings
Average tariff score of all pupils on S4 roll
Primary school attendance
Secondary school attendance

We will use a machine learning method to take a sample of the data and learn from it. Then we will make predictions and see if we can predict the "Deaths from alcohol conditions" value for each area from some of the other indicators. Let's first read in the data:

import numpy as np
import pandas as pd
import sys
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor

ver=pd.read_csv("well.csv")

Now let's generate the training data (taking 50% of the data) to train the system. The rows are taken randomly from the dataset, and we will use "Patients hospitalised with (COPD)", "Patients hospitalised with asthma", "Drug-related hospital stays", "Alcohol-related hospital stays", "Male life expectancy", and "All mortality among 15-44 year olds" as the input features, with "Deaths from alcohol conditions" as the value to predict:

train, test, y_train, y_test = train_test_split(
    ver[["Patients hospitalised with (COPD)",
         "Patients hospitalised with asthma",
         "Drug-related hospital stays",
         "Alcohol-related hospital stays",
         "Male life expectancy",
         "All mortality among 15-44 year olds"]],
    ver["Deaths from alcohol conditions"],
    test_size=0.5, random_state=1)
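
As a minimal sketch of what the split does, the following uses a small synthetic table (the column names `x1`, `x2` and `target` are made up for illustration and are not part of well.csv). It suggests that `test_size=0.5` puts half of the rows in each part, and that the returned pieces keep their original row labels in shuffled order:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the wellness data (hypothetical columns).
demo = pd.DataFrame({
    "x1": range(10),
    "x2": [v * 2 for v in range(10)],
    "target": [v * 3 for v in range(10)],
})

train_demo, test_demo, y_train_demo, y_test_demo = train_test_split(
    demo[["x1", "x2"]], demo["target"], test_size=0.5, random_state=1
)

print(len(train_demo), len(test_demo))  # half of the rows in each part
print(list(test_demo.index))            # original row labels, in shuffled order
# The features and the target of the test part share the same row labels.
print(list(y_test_demo.index) == list(test_demo.index))
```

Because the row labels are shuffled, position 0 of the test part is usually not row 0 of the original table, which matters when predictions are later compared with actual values.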

Now we will fit a model using the random forest method:

model= RandomForestRegressor(n_estimators=100,min_samples_leaf=10)

model.fit(train,y_train)
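
The forest above is built without a `random_state`, so each run can give slightly different predictions. As a hedged sketch on synthetic data (the arrays below are generated purely for illustration), fixing `random_state` makes the fit repeatable. Here `n_estimators` sets the number of trees and `min_samples_leaf` the minimum number of training rows allowed in each leaf:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic regression data: 200 rows, 3 features (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 3))
y = 2.0 * X[:, 0] + X[:, 1] + rng.normal(0, 0.5, size=200)

def fit_and_predict(seed):
    # Same hyperparameters as in the text, plus a fixed seed (an addition).
    forest = RandomForestRegressor(
        n_estimators=100, min_samples_leaf=10, random_state=seed
    )
    forest.fit(X[:100], y[:100])      # train on the first half
    return forest.predict(X[100:])    # predict the second half

first = fit_and_predict(seed=1)
second = fit_and_predict(seed=1)
print(np.allclose(first, second))  # same seed, same predictions
print(first.shape)
```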

We should now have created our model. Let's make predictions on the test data:

predictions =model.predict(test)

This will give us predicted values for deaths from alcohol conditions. Let's process these and count a prediction as a success if the magnitude of the error is 5 or less:

print ('%22s %s %s %s %s' % ("Area","Pred","Actual","Diff","Success"))

# Counters for predictions inside and outside the tolerance
success = 0
failure = 0

for x in range(0, len(predictions)):
    # predictions[x] belongs to row x of the test set, so take the
    # actual value and the area name from that same row
    actual = y_test.iloc[x]
    area = ver.loc[y_test.index[x], "Area"]
    error = abs(int(predictions[x]) - actual)
    if (error<=5):
        result = "Success"
        success = success+1
    else:
        result = "Failed!"
        failure = failure+1
    print('%22s %4d %4d %4d %s' % (area, int(predictions[x]), actual, error, result))

print ('Success: %3d Fail: %3d' % (success,failure))
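
One thing to watch in the loop above is that `predictions[x]` refers to row `x` of the test set, not row `x` of the original table, because the split shuffles the rows. A possible alternative, sketched here on made-up numbers (the area names, the values, and the `results` frame are all hypothetical), collects everything in a DataFrame keyed on the test index so that each prediction stays next to its own actual value:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the "Area" column, y_test (with shuffled
# row labels) and the model's predictions.
areas = pd.Series(["A", "B", "C", "D", "E", "F"], name="Area")
y_test_demo = pd.Series([30.0, 12.0, 21.0], index=[4, 0, 3], name="Actual")
predictions_demo = np.array([22.4, 14.9, 20.1])

results = pd.DataFrame({
    "Area": areas.loc[y_test_demo.index],   # look areas up by row label
    "Pred": predictions_demo.astype(int),   # truncate, as in the loop
    "Actual": y_test_demo,
}, index=y_test_demo.index)
results["Diff"] = (results["Pred"] - results["Actual"]).abs()
results["Success"] = results["Diff"] <= 5   # same tolerance as the loop

print(results)
print("Success: %3d Fail: %3d"
      % (results["Success"].sum(), (~results["Success"]).sum()))
```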

If we run the model, here are the results from one run (the forest is not seeded, so the values can vary between runs):

                  Area Pred Actual Diff Success
         Aberdeen City   19   21    2 Success
         Aberdeenshire   16    9    6 Failed!
                 Angus   18   16    2 Success
       Argyll and Bute   27   22    4 Success
      Clackmannanshire   19   19    0 Success
 Dumfries and Galloway   13   13    0 Success
           Dundee City   20   30   10 Failed!
         East Ayrshire   17   23    6 Failed!
   East Dunbartonshire   16   14    1 Success
     East Renfrewshire   15   14    0 Success
        Edinburgh City   30   21    9 Failed!
               Falkirk   19   18    0 Success
                  Fife   16   19    3 Success
          Glasgow City   18   40   22 Failed!
              Highland   29   22    6 Failed!
Success:   9 Fail:   6

We actually predicted 9 of the 15 values to within the tolerance! As a baseline, we can estimate the success rate if we simply guess a value of 14 deaths each time, which gives us 11 values. So the baseline guess is around 43%, whereas we achieved 60% from a limited set of parameters ... and that's machine learning!
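
The 60% figure can be checked directly from the results table above. This short sketch copies the Pred and Actual columns as displayed (they are shown as whole numbers, so individual differences may not match the Diff column exactly) and counts how many fall within the tolerance of 5. The function name `hit_rate` is just an illustrative helper:

```python
import numpy as np

def hit_rate(pred, actual, tolerance=5):
    """Fraction of predictions within `tolerance` of the actual value."""
    pred = np.asarray(pred, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.mean(np.abs(pred - actual) <= tolerance))

# Pred and Actual columns from the results table, as displayed.
pred = [19, 16, 18, 27, 19, 13, 20, 17, 16, 15, 30, 19, 16, 18, 29]
actual = [21, 9, 16, 22, 19, 13, 30, 23, 14, 14, 21, 18, 19, 40, 22]

rate = hit_rate(pred, actual)
print("Model hit rate: %.0f%%" % (rate * 100))
```

The same helper could score a constant guess by passing a list filled with that value as `pred`.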

Referencing this page

This site is currently free to use and does not contain any advertisements, but should be properly referenced when used in the dissemination of knowledge, including within blogs, research papers and other related activities. Sample reference forms are given below.

Ref: Buchanan, William J (2024). Machine Learning on Wellness. Asecuritysite.com. https://asecuritysite.com/bigdata/well_ai

Bib: @misc{asecuritysite_96409, title = {Machine Learning on Wellness}, year={2024}, organization = {Asecuritysite.com}, author = {Buchanan, William J}, url = {https://asecuritysite.com/bigdata/well_ai}, note={Accessed: April 26, 2024}, howpublished={\url{https://asecuritysite.com/bigdata/well_ai}} }

Licence: This site is intended for the education and advancement of humans, and no rights are given for AI and ML bots to crawl this site. All references to its content must be included.
