Notebook, Session 3
UE8: Introduction to Data Analysis
import pandas
df = pandas.read_excel("Donnees_M2_RD.xlsx")
df
 | Subject | Name_A | Name_B | Dist_A | Dist_B | Mode | Space | Side | Response | RT |
---|---|---|---|---|---|---|---|---|---|---|
0 | P_ADI_331 | 0 | 2 | 2 | 4 | Dic | E | D | 2 | 18865 |
1 | P_ADI_331 | 1 | 4 | 4 | 1 | Dic | E | D | 2 | 13157 |
2 | P_ADI_331 | 4 | 3 | 3 | 2 | Dic | E | D | 1 | 11628 |
3 | P_ADI_331 | 2 | 4 | 4 | 1 | Dic | E | D | 1 | 10068 |
4 | P_ADI_331 | 1 | 2 | 2 | 4 | Dic | E | D | 1 | 11801 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
9589 | P_VAR_330 | 0 | 1 | 3 | 5 | Dio | I | D | 1 | 7626 |
9590 | P_VAR_330 | 3 | 2 | 5 | 1 | Dio | I | D | 2 | 6349 |
9591 | P_VAR_330 | 2 | 0 | 4 | 2 | Dio | I | D | 2 | 9031 |
9592 | P_VAR_330 | 0 | 2 | 2 | 1 | Dio | I | D | 2 | 16323 |
9593 | P_VAR_330 | 0 | 3 | 5 | 1 | Dio | I | D | 2 | 10139 |
9594 rows × 10 columns
df[(df['Dist_A'] >= 3)]
 | Subject | Name_A | Name_B | Dist_A | Dist_B | Mode | Space | Side | Response | RT |
---|---|---|---|---|---|---|---|---|---|---|
1 | P_ADI_331 | 1 | 4 | 4 | 1 | Dic | E | D | 2 | 13157 |
2 | P_ADI_331 | 4 | 3 | 3 | 2 | Dic | E | D | 1 | 11628 |
3 | P_ADI_331 | 2 | 4 | 4 | 1 | Dic | E | D | 1 | 10068 |
6 | P_ADI_331 | 2 | 1 | 3 | 4 | Dic | E | D | 1 | 16347 |
8 | P_ADI_331 | 2 | 0 | 4 | 2 | Dic | E | D | 2 | 12589 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
9588 | P_VAR_330 | 0 | 3 | 3 | 1 | Dio | I | D | 2 | 6153 |
9589 | P_VAR_330 | 0 | 1 | 3 | 5 | Dio | I | D | 1 | 7626 |
9590 | P_VAR_330 | 3 | 2 | 5 | 1 | Dio | I | D | 2 | 6349 |
9591 | P_VAR_330 | 2 | 0 | 4 | 2 | Dio | I | D | 2 | 9031 |
9593 | P_VAR_330 | 0 | 3 | 5 | 1 | Dio | I | D | 2 | 10139 |
5757 rows × 10 columns
df[(df['Dist_A'] >= 3) & (df['Dist_B'] >= 3)]
 | Subject | Name_A | Name_B | Dist_A | Dist_B | Mode | Space | Side | Response | RT |
---|---|---|---|---|---|---|---|---|---|---|
6 | P_ADI_331 | 2 | 1 | 3 | 4 | Dic | E | D | 1 | 16347 |
10 | P_ADI_331 | 1 | 4 | 4 | 5 | Dic | E | D | 1 | 14774 |
13 | P_ADI_331 | 0 | 3 | 4 | 3 | Dic | E | D | 2 | 14330 |
15 | P_ADI_331 | 1 | 3 | 4 | 3 | Dic | E | D | 2 | 10828 |
17 | P_ADI_331 | 3 | 0 | 3 | 4 | Dic | E | D | 1 | 10438 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
9579 | P_VAR_330 | 4 | 3 | 3 | 4 | Dio | I | D | 2 | 12271 |
9581 | P_VAR_330 | 3 | 2 | 5 | 4 | Dio | I | D | 1 | 11327 |
9582 | P_VAR_330 | 1 | 0 | 3 | 5 | Dio | I | D | 2 | 15942 |
9583 | P_VAR_330 | 0 | 4 | 4 | 5 | Dio | I | D | 2 | 45627 |
9589 | P_VAR_330 | 0 | 1 | 3 | 5 | Dio | I | D | 1 | 7626 |
2879 rows × 10 columns
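The two filters above rely on boolean masks combined with `&`. The following is a minimal sketch of that mechanism on a hypothetical five-row table (the values are invented; only the column names `Dist_A` and `Dist_B` come from the notebook):

```python
import pandas

# Hypothetical toy data reusing two column names from the notebook.
toy = pandas.DataFrame({
    "Dist_A": [2, 4, 3, 1, 5],
    "Dist_B": [4, 1, 3, 5, 2],
})

# Each comparison yields a boolean Series aligned on the row index.
mask_a = toy["Dist_A"] >= 3
mask_b = toy["Dist_B"] >= 3

# & is the element-wise AND and | the element-wise OR. Each comparison
# needs its own parentheses when written inline, because & binds more
# tightly than >= in Python.
both = toy[mask_a & mask_b]
either = toy[mask_a | mask_b]
```

Naming the masks first, as here, avoids the parenthesis issue and makes each condition reusable.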
rt = df['RT']
rt.median()
10262.5
Give the number of subjects.
# subjects = df['Subject']
# subjects_uniques = subjects.drop_duplicates()
# subjects_uniques.count()
df['Subject'].drop_duplicates().count()
24
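As a possible alternative to `drop_duplicates().count()`, pandas also provides `Series.nunique()`. The sketch below compares the two on a hypothetical column (the subject codes are invented for illustration):

```python
import pandas

# Hypothetical subject codes; two subjects appear twice.
toy = pandas.DataFrame({"Subject": ["P_A", "P_A", "P_B", "P_C", "P_B"]})

# Approach used in the notebook: drop repeated values, then count what is left.
n_drop = toy["Subject"].drop_duplicates().count()

# Alternative: count distinct values directly.
n_unique = toy["Subject"].nunique()
```

Both approaches ignore missing values by default, so they should agree on a column like `Subject`.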
Give the number of trials in which the second name is Justin (coded as Name_B == 3).
justin_deuxieme = df[(df['Name_B'] == 3)]
justin_deuxieme['Subject'].count()
# justin_deuxieme
1918
Give the number of trials with a response time greater than 20000.
df[df['RT'] > 20000]['Subject'].count()
943
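The two counts above filter the table and then call `count()` on the `Subject` column. The sketch below shows, on hypothetical data, two other ways to count the rows matching a condition:

```python
import pandas

# Hypothetical trials; the subject codes and response times are invented.
toy = pandas.DataFrame({
    "Subject": ["P_A", "P_A", "P_B", "P_B"],
    "RT": [18865, 26000, 9031, 21956],
})

slow = toy["RT"] > 20000

# Approach used in the notebook: count non-missing Subject values
# among the filtered rows.
n_via_count = toy[slow]["Subject"].count()

# True counts as 1, so summing the mask counts the matching rows.
n_via_sum = int(slow.sum())

# The length of the filtered table is the number of matching rows.
n_via_len = len(toy[slow])
```

Note that `count()` skips missing values, so the notebook's approach matches the other two only if `Subject` has no missing entries in the filtered rows.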
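Give the mean response time when the name played first is the subject's own, and compare it with the overall mean.
# Mean RT over trials where the first name is the subject's own (Name_A == 0)
rtm_0 = df[df['Name_A'] == 0]['RT'].mean()
# Mean RT over all trials
rtm_g = df['RT'].mean()
# Relative difference between the overall mean and the own-name mean
(rtm_g - rtm_0) / rtm_g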
-0.00866465320898404
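To make the sign of this result easier to read, here is a small worked sketch of the same computation on hypothetical values chosen so the arithmetic can be checked by hand:

```python
import pandas

# Hypothetical trials: Name_A == 0 marks the subject's own name, as in the notebook.
toy = pandas.DataFrame({
    "Name_A": [0, 1, 0, 2],
    "RT": [12000, 10000, 14000, 8000],
})

# Mean over own-name trials: (12000 + 14000) / 2 = 13000
mean_own = toy.loc[toy["Name_A"] == 0, "RT"].mean()

# Mean over all trials: 44000 / 4 = 11000
mean_all = toy["RT"].mean()

# Relative difference: (11000 - 13000) / 11000, about -0.18
rel_diff = (mean_all - mean_own) / mean_all
```

A negative value means the own-name mean is larger than the overall mean. Read the same way, the value of about -0.0087 obtained above corresponds to an own-name mean roughly 0.87% above the overall mean.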
Subjects with at least one response time above 50000.
df[df['RT'] > 50000]['Subject'].drop_duplicates()
650 P_ALM_345
1733 P_BEH_340
2798 P_BOA_321
3232 P_BOC_342
3872 P_CAR_327
4953 P_GAM_338
6447 P_LAC_354
7609 P_ROS_336
8424 P_TAI_343
9091 P_VAL_329
9373 P_VAR_330
Name: Subject, dtype: object
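In the output above, the number to the left of each subject is the row label of that subject's first trial above the threshold, since `drop_duplicates()` keeps the first occurrence. The sketch below reproduces this on hypothetical data and shows one possible alternative based on `groupby`:

```python
import pandas

# Hypothetical trials; the subject codes and response times are invented.
toy = pandas.DataFrame({
    "Subject": ["P_A", "P_A", "P_B", "P_C", "P_C"],
    "RT": [51000, 60000, 9000, 7000, 52000],
})

# Approach used in the notebook: keep slow trials, then one row per subject.
via_filter = toy[toy["RT"] > 50000]["Subject"].drop_duplicates()

# Alternative: a subject qualifies if their slowest trial exceeds the threshold.
max_rt = toy.groupby("Subject")["RT"].max()
via_groupby = max_rt[max_rt > 50000].index
```

The `groupby` version returns the subject codes as an index rather than as a column with row labels, which may be more convenient when only the list of subjects matters.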
df['RT'] / 10000
0 1.8865
1 1.3157
2 1.1628
3 1.0068
4 1.1801
...
9589 0.7626
9590 0.6349
9591 0.9031
9592 1.6323
9593 1.0139
Name: RT, Length: 9594, dtype: float64
df[(df['Dist_A'] - df['Dist_B']) > 0]
 | Subject | Name_A | Name_B | Dist_A | Dist_B | Mode | Space | Side | Response | RT |
---|---|---|---|---|---|---|---|---|---|---|
1 | P_ADI_331 | 1 | 4 | 4 | 1 | Dic | E | D | 2 | 13157 |
2 | P_ADI_331 | 4 | 3 | 3 | 2 | Dic | E | D | 1 | 11628 |
3 | P_ADI_331 | 2 | 4 | 4 | 1 | Dic | E | D | 1 | 10068 |
8 | P_ADI_331 | 2 | 0 | 4 | 2 | Dic | E | D | 2 | 12589 |
9 | P_ADI_331 | 2 | 1 | 4 | 2 | Dic | E | D | 2 | 10973 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
9588 | P_VAR_330 | 0 | 3 | 3 | 1 | Dio | I | D | 2 | 6153 |
9590 | P_VAR_330 | 3 | 2 | 5 | 1 | Dio | I | D | 2 | 6349 |
9591 | P_VAR_330 | 2 | 0 | 4 | 2 | Dio | I | D | 2 | 9031 |
9592 | P_VAR_330 | 0 | 2 | 2 | 1 | Dio | I | D | 2 | 16323 |
9593 | P_VAR_330 | 0 | 3 | 5 | 1 | Dio | I | D | 2 | 10139 |
4797 rows × 10 columns
df['diff_dist'] = df['Dist_A'] - df['Dist_B']
df
 | Subject | Name_A | Name_B | Dist_A | Dist_B | Mode | Space | Side | Response | RT | diff_dist |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | P_ADI_331 | 0 | 2 | 2 | 4 | Dic | E | D | 2 | 18865 | -2 |
1 | P_ADI_331 | 1 | 4 | 4 | 1 | Dic | E | D | 2 | 13157 | 3 |
2 | P_ADI_331 | 4 | 3 | 3 | 2 | Dic | E | D | 1 | 11628 | 1 |
3 | P_ADI_331 | 2 | 4 | 4 | 1 | Dic | E | D | 1 | 10068 | 3 |
4 | P_ADI_331 | 1 | 2 | 2 | 4 | Dic | E | D | 1 | 11801 | -2 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
9589 | P_VAR_330 | 0 | 1 | 3 | 5 | Dio | I | D | 1 | 7626 | -2 |
9590 | P_VAR_330 | 3 | 2 | 5 | 1 | Dio | I | D | 2 | 6349 | 4 |
9591 | P_VAR_330 | 2 | 0 | 4 | 2 | Dio | I | D | 2 | 9031 | 2 |
9592 | P_VAR_330 | 0 | 2 | 2 | 1 | Dio | I | D | 2 | 16323 | 1 |
9593 | P_VAR_330 | 0 | 3 | 5 | 1 | Dio | I | D | 2 | 10139 | 4 |
9594 rows × 11 columns
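Assigning to `df['diff_dist']` modifies `df` in place. As a sketch of a non-mutating alternative, `DataFrame.assign` returns a new table and leaves the original untouched (the data below is hypothetical; only the column names come from the notebook):

```python
import pandas

# Hypothetical distances reusing the notebook's column names.
toy = pandas.DataFrame({"Dist_A": [2, 4, 3], "Dist_B": [4, 1, 2]})

# assign() returns a copy with the extra column; toy itself keeps two columns.
with_diff = toy.assign(diff_dist=toy["Dist_A"] - toy["Dist_B"])
```

This can help in a notebook, where re-running cells out of order on a mutated `df` is a common source of confusion.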
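# Trials with a response time of at most 20000
rt_p = df[df['RT'] <= 20000]
# Trials with a response time above 20000
rt_g = df[df['RT'] > 20000]
# Stack the two subsets back together, slow trials first
pandas.concat([rt_g, rt_p])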
 | Subject | Name_A | Name_B | Dist_A | Dist_B | Mode | Space | Side | Response | RT | diff_dist |
---|---|---|---|---|---|---|---|---|---|---|---|
400 | P_ALM_345 | 4 | 0 | 3 | 1 | Dio | I | D | 2 | 26000 | 2 |
407 | P_ALM_345 | 3 | 4 | 4 | 2 | Dio | I | D | 2 | 21956 | 2 |
411 | P_ALM_345 | 0 | 1 | 4 | 5 | Dio | I | D | 1 | 27613 | -1 |
428 | P_ALM_345 | 3 | 2 | 5 | 2 | Dio | I | D | 1 | 20589 | 3 |
439 | P_ALM_345 | 4 | 2 | 4 | 5 | Dio | I | D | 1 | 22400 | -1 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
9589 | P_VAR_330 | 0 | 1 | 3 | 5 | Dio | I | D | 1 | 7626 | -2 |
9590 | P_VAR_330 | 3 | 2 | 5 | 1 | Dio | I | D | 2 | 6349 | 4 |
9591 | P_VAR_330 | 2 | 0 | 4 | 2 | Dio | I | D | 2 | 9031 | 2 |
9592 | P_VAR_330 | 0 | 2 | 2 | 1 | Dio | I | D | 2 | 16323 | 1 |
9593 | P_VAR_330 | 0 | 3 | 5 | 1 | Dio | I | D | 2 | 10139 | 4 |
9594 rows × 11 columns
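The concatenated table has the same 9594 rows as `df`, but in a different order, and each row keeps its original label. The sketch below illustrates this on hypothetical response times and shows how `sort_index()` can restore the original order:

```python
import pandas

# Hypothetical response times, invented for illustration.
toy = pandas.DataFrame({"RT": [18865, 26000, 9031, 21956]})

# Split into two complementary subsets, as in the notebook.
fast = toy[toy["RT"] <= 20000]
slow = toy[toy["RT"] > 20000]

# concat stacks the pieces in the order given; row labels are preserved.
combined = pandas.concat([slow, fast])

# Sorting by row label puts the rows back in their original order.
restored = combined.sort_index()
```

If fresh labels are preferred instead, `pandas.concat([slow, fast], ignore_index=True)` renumbers the rows from 0.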