UE8: Notebook for session 2
# Import the libraries we need
import pandas
Loading the data from the file Donnees_M2_RD.xlsx
dfxls = pandas.read_excel("Donnees_M2_RD.xlsx")
Loading the M1 experimental dataframes into a list
expes = []
for i in range(8):
    filename = "expe/subject-"+str(i)+".csv"
    print("Loading "+filename)
    df = pandas.read_csv(filename)
    # Tag every row with the subject number before merging
    df['Subject'] = i
    expes.append(df)
# Stack the eight per-subject dataframes into a single one
dfm1 = pandas.concat(expes)
Loading expe/subject-0.csv
Loading expe/subject-1.csv
Loading expe/subject-2.csv
Loading expe/subject-3.csv
Loading expe/subject-4.csv
Loading expe/subject-5.csv
Loading expe/subject-6.csv
Loading expe/subject-7.csv
dfxls
| | Subject | Name_A | Name_B | Dist_A | Dist_B | Mode | Space | Side | Response | RT |
---|---|---|---|---|---|---|---|---|---|---|
0 | P_ADI_331 | 0 | 2 | 2 | 4 | Dic | E | D | 2 | 18865 |
1 | P_ADI_331 | 1 | 4 | 4 | 1 | Dic | E | D | 2 | 13157 |
2 | P_ADI_331 | 4 | 3 | 3 | 2 | Dic | E | D | 1 | 11628 |
3 | P_ADI_331 | 2 | 4 | 4 | 1 | Dic | E | D | 1 | 10068 |
4 | P_ADI_331 | 1 | 2 | 2 | 4 | Dic | E | D | 1 | 11801 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
9589 | P_VAR_330 | 0 | 1 | 3 | 5 | Dio | I | D | 1 | 7626 |
9590 | P_VAR_330 | 3 | 2 | 5 | 1 | Dio | I | D | 2 | 6349 |
9591 | P_VAR_330 | 2 | 0 | 4 | 2 | Dio | I | D | 2 | 9031 |
9592 | P_VAR_330 | 0 | 2 | 2 | 1 | Dio | I | D | 2 | 16323 |
9593 | P_VAR_330 | 0 | 3 | 5 | 1 | Dio | I | D | 2 | 10139 |
9594 rows × 10 columns
dfm1
| | acc | accuracy | average_response_time | avg_rt | background | canvas_backend | clock_backend | color_backend | correct | correct_message_debut_tirage | ... | time_tirage | time_tirage_loop | time_welcome | tirage | title | total_correct | total_response_time | total_responses | width | Subject |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 0 | 1012 | 1012 | #3d3846 | legacy | legacy | legacy | 0 | undefined | ... | 54014 | 53974 | 1329 | 20 | Nouvelle expérience | 0 | 1012.0 | 1 | 1024 | 0 |
1 | 100 | 100 | 662 | 662 | #3d3846 | legacy | legacy | legacy | 1 | undefined | ... | 54014 | 53974 | 1329 | 20 | Nouvelle expérience | 1 | 662.0 | 1 | 1024 | 0 |
2 | 0 | 0 | 710 | 710 | #3d3846 | legacy | legacy | legacy | 0 | undefined | ... | 54014 | 53974 | 1329 | 20 | Nouvelle expérience | 0 | 710.0 | 1 | 1024 | 0 |
3 | 0 | 0 | 742 | 742 | #3d3846 | legacy | legacy | legacy | 0 | undefined | ... | 54014 | 53974 | 1329 | 20 | Nouvelle expérience | 0 | 742.0 | 1 | 1024 | 0 |
4 | 0 | 0 | 806 | 806 | #3d3846 | legacy | legacy | legacy | 0 | undefined | ... | 54014 | 53974 | 1329 | 20 | Nouvelle expérience | 0 | 806.0 | 1 | 1024 | 0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
433 | 100 | 100 | 305 | 305 | #3d3846 | legacy | legacy | legacy | 1 | undefined | ... | 790915 | 7847 | 1443 | 5 | Nouvelle expérience | 1 | 305.0 | 1 | 1024 | 7 |
434 | 0 | 0 | 290 | 290 | #3d3846 | legacy | legacy | legacy | 0 | undefined | ... | 790915 | 7847 | 1443 | 5 | Nouvelle expérience | 0 | 290.0 | 1 | 1024 | 7 |
435 | 0 | 0 | 260 | 260 | #3d3846 | legacy | legacy | legacy | 0 | undefined | ... | 790915 | 7847 | 1443 | 5 | Nouvelle expérience | 0 | 260.0 | 1 | 1024 | 7 |
436 | 0 | 0 | 554 | 554 | #3d3846 | legacy | legacy | legacy | 0 | undefined | ... | 790915 | 7847 | 1443 | 5 | Nouvelle expérience | 0 | 554.0 | 1 | 1024 | 7 |
437 | 0 | 0 | 605 | 605 | #3d3846 | legacy | legacy | legacy | 0 | undefined | ... | 790915 | 7847 | 1443 | 5 | Nouvelle expérience | 0 | 605.0 | 1 | 1024 | 7 |
3504 rows × 110 columns
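The dfm1 display above ends at row label 437 although the table has 3504 rows: pandas.concat keeps each file's own index by default, so the labels restart for every subject. The sketch below shows one way to get a single running index instead. The file names, the temporary directory and the toy column values are stand-ins invented for the example, not the real experiment files.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Build two tiny stand-in files so the sketch runs without the real data.
tmpdir = Path(tempfile.mkdtemp())
for i in range(2):
    pd.DataFrame({"response_time": [100 + i, 200 + i]}).to_csv(
        tmpdir / f"subject-{i}.csv", index=False
    )

# Same idea as the loading loop: read each file and tag it with its subject number.
frames = []
for i in range(2):
    frame = pd.read_csv(tmpdir / f"subject-{i}.csv")
    frame["Subject"] = i
    frames.append(frame)

# ignore_index=True renumbers the rows from 0 instead of restarting for each file.
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

Whether to renumber is a matter of taste: keeping the per-file index preserves the trial number within each subject, while a running index makes every row label unique.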
# List the column names (df still holds the last subject loaded by the loop above)
for col in df.axes[1]:
    print(col)
acc
accuracy
average_response_time
avg_rt
background
canvas_backend
clock_backend
color_backend
correct
correct_message_debut_tirage
correct_reponse_question1
correct_reponse_question2
correct_welcome
count_boucle_images
count_choix_csv
count_choix_csv2
count_config_script
count_experiment
count_getting_started
count_image
count_image_et_questions
count_message_debut_tirage
count_question1
count_question1_feedback
count_question2
count_question2_feedback
count_reponse_question1
count_reponse_question2
count_resultats_logger
count_tirage
count_tirage_loop
count_welcome
datetime
description
disable_garbage_collection
experiment_file
experiment_path
fichier_csv
font_bold
font_family
font_italic
font_size
font_underline
foreground
form_clicks
fullscreen
height
image
keyboard_backend
live_row
live_row_boucle_images
live_row_tirage_loop
logfile
mouse_backend
nombre_tirages
num_tirage
opensesame_codename
opensesame_version
q1
q2
rep_autorisee1
rep_autorisee2
rep_ok1
rep_ok2
repeat_cycle
response
response_message_debut_tirage
response_reponse_question1
response_reponse_question2
response_time
response_time_message_debut_tirage
response_time_reponse_question1
response_time_reponse_question2
response_time_welcome
response_welcome
round_decimals
sampler_backend
sound_buf_size
sound_channels
sound_freq
sound_sample_size
start
subject_nr
subject_parity
time_boucle_images
time_choix_csv
time_choix_csv2
time_config_script
time_experiment
time_getting_started
time_image
time_image_et_questions
time_message_debut_tirage
time_question1
time_question1_feedback
time_question2
time_question2_feedback
time_reponse_question1
time_reponse_question2
time_resultats_logger
time_tirage
time_tirage_loop
time_welcome
tirage
title
total_correct
total_response_time
total_responses
width
Subject
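With 110 columns, it can help to narrow the list down by name. The following is a minimal sketch on a toy dataframe (the three columns and their values are made up for illustration); it relies on `df.columns`, which is the same object as `df.axes[1]`, and on `DataFrame.filter`.

```python
import pandas as pd

# Toy dataframe with a few column names borrowed from the listing above.
toy = pd.DataFrame({
    "response_time": [512, 640],
    "response_time_welcome": [1200, 1350],
    "width": [1024, 1024],
})

# df.columns is a shorter spelling of df.axes[1].
print(list(toy.columns))

# Keep only the columns whose name contains "response_time".
rt_cols = toy.filter(like="response_time")
print(rt_cols.columns.tolist())
```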
dfxls["Subject"].drop_duplicates()
0 P_ADI_331
400 P_ALM_345
800 P_AMY_346
1200 P_BAM_347
1600 P_BEH_340
2000 P_BLC_325
2399 P_BLR_321
2798 P_BOA_321
3197 P_BOC_342
3597 P_CAR_327
3995 P_CAV_333
4395 P_CON_336
4795 P_GAM_338
5195 P_GHM_334
5595 P_GRC_341
5995 P_GRF_322
6394 P_LAC_354
6794 P_LEG_335
7194 P_MOE_339
7594 P_ROS_336
7994 P_SOA_337
8394 P_TAI_343
8794 P_VAL_329
9194 P_VAR_330
Name: Subject, dtype: object
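drop_duplicates keeps the first occurrence of each value together with its original row label, which is why the labels above jump by roughly 400: each label marks where a subject's block of trials begins. If only the count or the number of trials per subject is needed, `nunique` and `value_counts` are possible alternatives. A small sketch on invented subject codes:

```python
import pandas as pd

# Invented subject codes, standing in for the real "Subject" column.
subjects = pd.Series(["P_A", "P_A", "P_B", "P_B", "P_C"], name="Subject")

# First occurrence of each value, with the original row labels kept.
first_rows = subjects.drop_duplicates()
print(first_rows)

# Number of distinct subjects.
n_subjects = subjects.nunique()
print(n_subjects)

# Number of rows (trials) per subject.
trials_per_subject = subjects.value_counts()
print(trials_per_subject)
```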
rt = dfxls["RT"]
rt
0 18865
1 13157
2 11628
3 10068
4 11801
...
9589 7626
9590 6349
9591 9031
9592 16323
9593 10139
Name: RT, Length: 9594, dtype: int64
rt.max()
103152
rt.mean()
12050.088597039816
rt.std()
7085.96882950782
rt.quantile(0.4)
9315.000000000002
dfxls.min()
Subject P_ADI_331
Name_A 0
Name_B 0
Dist_A 1
Dist_B 1
Mode Dic
Space E
Side D
Response 1
RT 2703
dtype: object
rt.quantile([0.25,0.5,0.75])
0.25 7896.25
0.50 10262.50
0.75 13696.50
Name: RT, dtype: float64
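The statistics computed one by one above can also be obtained in a single call with `describe`. The sketch below uses five invented response times; it also illustrates that `quantile` interpolates linearly between neighbouring values by default, which is probably why `rt.quantile(0.4)` returned a non-integer value even though RT is an integer column.

```python
import pandas as pd

# Five invented response times, standing in for the real RT column.
toy_rt = pd.Series([2000, 4000, 6000, 8000, 10000], name="RT")

# count, mean, std, min, quartiles and max in one call.
summary = toy_rt.describe()
print(summary)

# The 40% quantile falls between the 2nd and 3rd sorted values,
# so pandas interpolates between them.
q40 = toy_rt.quantile(0.4)
print(q40)
```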
condition = (rt >= 12000)
condition
0 True
1 True
2 False
3 False
4 False
...
9589 False
9590 False
9591 False
9592 True
9593 False
Name: RT, Length: 9594, dtype: bool
Here, condition is a Series of booleans: indexing dfxls with it keeps only the rows where the value is True.
dfxls[condition]
| | Subject | Name_A | Name_B | Dist_A | Dist_B | Mode | Space | Side | Response | RT |
---|---|---|---|---|---|---|---|---|---|---|
0 | P_ADI_331 | 0 | 2 | 2 | 4 | Dic | E | D | 2 | 18865 |
1 | P_ADI_331 | 1 | 4 | 4 | 1 | Dic | E | D | 2 | 13157 |
5 | P_ADI_331 | 2 | 1 | 2 | 3 | Dic | E | D | 2 | 12117 |
6 | P_ADI_331 | 2 | 1 | 3 | 4 | Dic | E | D | 1 | 16347 |
7 | P_ADI_331 | 0 | 3 | 2 | 4 | Dic | E | D | 1 | 13237 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
9582 | P_VAR_330 | 1 | 0 | 3 | 5 | Dio | I | D | 2 | 15942 |
9583 | P_VAR_330 | 0 | 4 | 4 | 5 | Dio | I | D | 2 | 45627 |
9585 | P_VAR_330 | 0 | 3 | 2 | 3 | Dio | I | D | 2 | 16671 |
9586 | P_VAR_330 | 0 | 1 | 2 | 3 | Dio | I | D | 1 | 18002 |
9592 | P_VAR_330 | 0 | 2 | 2 | 1 | Dio | I | D | 2 | 16323 |
3336 rows × 10 columns
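Because True counts as 1 and False as 0, a boolean Series can be summed or averaged to learn how many rows a condition selects before filtering. A minimal sketch, using the first five RT values shown earlier as toy data:

```python
import pandas as pd

# The first five RT values of the table, used here as a toy dataframe.
toy = pd.DataFrame({"RT": [18865, 13157, 11628, 10068, 11801]})

# Element-wise comparison: one boolean per row.
mask = toy["RT"] >= 12000

# Number and share of rows that satisfy the condition.
n_slow = int(mask.sum())
share_slow = float(mask.mean())
print(n_slow, share_slow)

# Boolean indexing keeps the rows where the mask is True.
slow = toy[mask]
print(slow)
```

Applied to the full data, the same sum would be expected to give the 3336 rows reported above.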
rt = dfxls["RT"]
cond = rt >= 12000
dfxls[cond]
| | Subject | Name_A | Name_B | Dist_A | Dist_B | Mode | Space | Side | Response | RT |
---|---|---|---|---|---|---|---|---|---|---|
0 | P_ADI_331 | 0 | 2 | 2 | 4 | Dic | E | D | 2 | 18865 |
1 | P_ADI_331 | 1 | 4 | 4 | 1 | Dic | E | D | 2 | 13157 |
5 | P_ADI_331 | 2 | 1 | 2 | 3 | Dic | E | D | 2 | 12117 |
6 | P_ADI_331 | 2 | 1 | 3 | 4 | Dic | E | D | 1 | 16347 |
7 | P_ADI_331 | 0 | 3 | 2 | 4 | Dic | E | D | 1 | 13237 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
9582 | P_VAR_330 | 1 | 0 | 3 | 5 | Dio | I | D | 2 | 15942 |
9583 | P_VAR_330 | 0 | 4 | 4 | 5 | Dio | I | D | 2 | 45627 |
9585 | P_VAR_330 | 0 | 3 | 2 | 3 | Dio | I | D | 2 | 16671 |
9586 | P_VAR_330 | 0 | 1 | 2 | 3 | Dio | I | D | 1 | 18002 |
9592 | P_VAR_330 | 0 | 2 | 2 | 1 | Dio | I | D | 2 | 16323 |
3336 rows × 10 columns
dfxls[dfxls["RT"] >= 12000]
| | Subject | Name_A | Name_B | Dist_A | Dist_B | Mode | Space | Side | Response | RT |
---|---|---|---|---|---|---|---|---|---|---|
0 | P_ADI_331 | 0 | 2 | 2 | 4 | Dic | E | D | 2 | 18865 |
1 | P_ADI_331 | 1 | 4 | 4 | 1 | Dic | E | D | 2 | 13157 |
5 | P_ADI_331 | 2 | 1 | 2 | 3 | Dic | E | D | 2 | 12117 |
6 | P_ADI_331 | 2 | 1 | 3 | 4 | Dic | E | D | 1 | 16347 |
7 | P_ADI_331 | 0 | 3 | 2 | 4 | Dic | E | D | 1 | 13237 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
9582 | P_VAR_330 | 1 | 0 | 3 | 5 | Dio | I | D | 2 | 15942 |
9583 | P_VAR_330 | 0 | 4 | 4 | 5 | Dio | I | D | 2 | 45627 |
9585 | P_VAR_330 | 0 | 3 | 2 | 3 | Dio | I | D | 2 | 16671 |
9586 | P_VAR_330 | 0 | 1 | 2 | 3 | Dio | I | D | 1 | 18002 |
9592 | P_VAR_330 | 0 | 2 | 2 | 1 | Dio | I | D | 2 | 16323 |
3336 rows × 10 columns
dfxls["RT" >= 12000]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[30], line 1
----> 1 dfxls["RT" >= 12000]
TypeError: '>=' not supported between instances of 'str' and 'int'
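The error comes from where the comparison is written. Python evaluates the expression inside the brackets first, and `"RT" >= 12000` compares a plain string with an integer, before pandas is involved at all. The comparison has to be applied to the column, `dfxls["RT"]`, and the result is then used as the index. A small sketch on a toy dataframe (three RT values taken from the table) reproduces both cases:

```python
import pandas as pd

toy = pd.DataFrame({"RT": [18865, 13157, 11628]})

# Wrong: the string "RT" itself is compared with 12000.
try:
    toy["RT" >= 12000]
except TypeError as err:
    message = str(err)
    print(message)

# Right: select the column first, then compare it element-wise.
mask = toy["RT"] >= 12000
print(toy[mask])
```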
# recall that rt holds dfxls["RT"]
dfxls[rt >= 12000]
| | Subject | Name_A | Name_B | Dist_A | Dist_B | Mode | Space | Side | Response | RT |
---|---|---|---|---|---|---|---|---|---|---|
0 | P_ADI_331 | 0 | 2 | 2 | 4 | Dic | E | D | 2 | 18865 |
1 | P_ADI_331 | 1 | 4 | 4 | 1 | Dic | E | D | 2 | 13157 |
5 | P_ADI_331 | 2 | 1 | 2 | 3 | Dic | E | D | 2 | 12117 |
6 | P_ADI_331 | 2 | 1 | 3 | 4 | Dic | E | D | 1 | 16347 |
7 | P_ADI_331 | 0 | 3 | 2 | 4 | Dic | E | D | 1 | 13237 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
9582 | P_VAR_330 | 1 | 0 | 3 | 5 | Dio | I | D | 2 | 15942 |
9583 | P_VAR_330 | 0 | 4 | 4 | 5 | Dio | I | D | 2 | 45627 |
9585 | P_VAR_330 | 0 | 3 | 2 | 3 | Dio | I | D | 2 | 16671 |
9586 | P_VAR_330 | 0 | 1 | 2 | 3 | Dio | I | D | 1 | 18002 |
9592 | P_VAR_330 | 0 | 2 | 2 | 1 | Dio | I | D | 2 | 16323 |
3336 rows × 10 columns
sup_8000 = (rt >= 8000)
inf_12000 = (rt <= 12000)
sup_8000
0 True
1 True
2 True
3 True
4 True
...
9589 False
9590 False
9591 True
9592 True
9593 True
Name: RT, Length: 9594, dtype: bool
inf_12000
0 False
1 False
2 True
3 True
4 True
...
9589 True
9590 True
9591 True
9592 False
9593 True
Name: RT, Length: 9594, dtype: bool
entre_8000_12000 = (sup_8000 & inf_12000)
entre_8000_12000
0 False
1 False
2 True
3 True
4 True
...
9589 False
9590 False
9591 True
9592 False
9593 True
Name: RT, Length: 9594, dtype: bool
dfxls[entre_8000_12000]
| | Subject | Name_A | Name_B | Dist_A | Dist_B | Mode | Space | Side | Response | RT |
---|---|---|---|---|---|---|---|---|---|---|
2 | P_ADI_331 | 4 | 3 | 3 | 2 | Dic | E | D | 1 | 11628 |
3 | P_ADI_331 | 2 | 4 | 4 | 1 | Dic | E | D | 1 | 10068 |
4 | P_ADI_331 | 1 | 2 | 2 | 4 | Dic | E | D | 1 | 11801 |
9 | P_ADI_331 | 2 | 1 | 4 | 2 | Dic | E | D | 2 | 10973 |
11 | P_ADI_331 | 1 | 2 | 4 | 2 | Dic | E | D | 2 | 11471 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
9576 | P_VAR_330 | 1 | 2 | 2 | 4 | Dio | I | D | 1 | 9141 |
9578 | P_VAR_330 | 2 | 3 | 2 | 4 | Dio | I | D | 2 | 11208 |
9581 | P_VAR_330 | 3 | 2 | 5 | 4 | Dio | I | D | 1 | 11327 |
9591 | P_VAR_330 | 2 | 0 | 4 | 2 | Dio | I | D | 2 | 9031 |
9593 | P_VAR_330 | 0 | 3 | 5 | 1 | Dio | I | D | 2 | 10139 |
3761 rows × 10 columns
dfxls[(dfxls["RT"] >= 8000) & (dfxls["RT"] <= 12000)]
| | Subject | Name_A | Name_B | Dist_A | Dist_B | Mode | Space | Side | Response | RT |
---|---|---|---|---|---|---|---|---|---|---|
2 | P_ADI_331 | 4 | 3 | 3 | 2 | Dic | E | D | 1 | 11628 |
3 | P_ADI_331 | 2 | 4 | 4 | 1 | Dic | E | D | 1 | 10068 |
4 | P_ADI_331 | 1 | 2 | 2 | 4 | Dic | E | D | 1 | 11801 |
9 | P_ADI_331 | 2 | 1 | 4 | 2 | Dic | E | D | 2 | 10973 |
11 | P_ADI_331 | 1 | 2 | 4 | 2 | Dic | E | D | 2 | 11471 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
9576 | P_VAR_330 | 1 | 2 | 2 | 4 | Dio | I | D | 1 | 9141 |
9578 | P_VAR_330 | 2 | 3 | 2 | 4 | Dio | I | D | 2 | 11208 |
9581 | P_VAR_330 | 3 | 2 | 5 | 4 | Dio | I | D | 1 | 11327 |
9591 | P_VAR_330 | 2 | 0 | 4 | 2 | Dio | I | D | 2 | 9031 |
9593 | P_VAR_330 | 0 | 3 | 5 | 1 | Dio | I | D | 2 | 10139 |
3761 rows × 10 columns
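Two remarks on the combined condition. The parentheses around each comparison are needed because `&` binds more tightly than `>=` and `<=` in Python. For a closed interval like this one, `Series.between` is a possible shorthand, since it includes both bounds by default. The sketch below compares the two spellings on six RT values taken from the table:

```python
import pandas as pd

toy = pd.DataFrame({"RT": [18865, 13157, 11628, 10068, 11801, 7626]})
rt_toy = toy["RT"]

# Two comparisons combined element-wise with & (note the parentheses).
with_and = toy[(rt_toy >= 8000) & (rt_toy <= 12000)]

# Same interval with between(), which includes both bounds by default.
with_between = toy[rt_toy.between(8000, 12000)]

print(with_and)
print(with_and.equals(with_between))
```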
rt = dfxls["RT"]
df_8_12 = dfxls[(rt >= 8000) & (rt <= 12000)]
df_8_12["Name_A"]
2 4
3 2
4 1
9 2
11 1
..
9576 1
9578 2
9581 3
9591 2
9593 0
Name: Name_A, Length: 3761, dtype: int64
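Filtering the rows and then picking a column can also be written in one step with `.loc`, which takes the row condition and the column name together. This is a sketch of an alternative spelling, not a change to the approach above; the toy values are four rows borrowed from the table.

```python
import pandas as pd

toy = pd.DataFrame({
    "Name_A": [4, 2, 1, 0],
    "RT": [11628, 10068, 11801, 18865],
})
mask = (toy["RT"] >= 8000) & (toy["RT"] <= 12000)

# Two steps: filter the rows, then select the column.
two_steps = toy[mask]["Name_A"]

# One step: rows and column given together to .loc.
one_step = toy.loc[mask, "Name_A"]

print(one_step)
```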
dfxls[dfxls["RT"] >= 80000]["Subject"].drop_duplicates()
2798 P_BOA_321
3299 P_BOC_342
6447 P_LAC_354
9091 P_VAL_329
9425 P_VAR_330
Name: Subject, dtype: object
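The list above says which subjects have at least one very long response, but not how many. One possible follow-up is to count the selected rows per subject with `value_counts`. The subject codes and RT values below are invented for the sketch.

```python
import pandas as pd

# Invented data: two subjects with very long responses, one without.
toy = pd.DataFrame({
    "Subject": ["P_A", "P_A", "P_B", "P_B", "P_C"],
    "RT": [90000, 5000, 85000, 81000, 7000],
})

# Rows with a response time of at least 80000.
slow = toy[toy["RT"] >= 80000]

# Which subjects are concerned, and how many such trials each one has.
slow_subjects = slow["Subject"].unique().tolist()
slow_counts = slow["Subject"].value_counts()

print(slow_subjects)
print(slow_counts)
```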