Notebook de la séance 9

Connexion à la base de donnée et chargement de Pandas

%load_ext sql
%config SqlMagic.dsn_filename = "connections.ini"
%sql --section bd-pedago
%sql --connections
import pandas

Connecting to 'bd-pedago'

Chargement des données excel en base en passant par une dataframe df

df = pandas.read_excel("Donnees_M2_RD.xlsx")
%sql --persist-replace df

Running query in 'bd-pedago'

Success! Persisted df to the database.

Récupérer les données de toute la table

%%sql
SELECT *
FROM df

Running query in 'bd-pedago'

9594 rows affected.

index Subject Name_A Name_B Dist_A Dist_B Mode Space Side Response RT
0 P_ADI_331 0 2 2 4 Dic E D 2 18865
1 P_ADI_331 1 4 4 1 Dic E D 2 13157
2 P_ADI_331 4 3 3 2 Dic E D 1 11628
3 P_ADI_331 2 4 4 1 Dic E D 1 10068
4 P_ADI_331 1 2 2 4 Dic E D 1 11801
5 P_ADI_331 2 1 2 3 Dic E D 2 12117
6 P_ADI_331 2 1 3 4 Dic E D 1 16347
7 P_ADI_331 0 3 2 4 Dic E D 1 13237
8 P_ADI_331 2 0 4 2 Dic E D 2 12589
9 P_ADI_331 2 1 4 2 Dic E D 2 10973
Truncated to displaylimit of 10.

Si on veut seulement 2 colonnes

%%sql
SELECT "Subject", "RT"
FROM df

Running query in 'bd-pedago'

9594 rows affected.

Subject RT
P_ADI_331 18865
P_ADI_331 13157
P_ADI_331 11628
P_ADI_331 10068
P_ADI_331 11801
P_ADI_331 12117
P_ADI_331 16347
P_ADI_331 13237
P_ADI_331 12589
P_ADI_331 10973
Truncated to displaylimit of 10.

En filtrant des lignes

%%sql
SELECT *
FROM df
WHERE "Space" = 'E'
  AND "Name_A" <> 0
  AND "Name_B" = 1

Running query in 'bd-pedago'

720 rows affected.

index Subject Name_A Name_B Dist_A Dist_B Mode Space Side Response RT
5 P_ADI_331 2 1 2 3 Dic E D 2 12117
6 P_ADI_331 2 1 3 4 Dic E D 1 16347
9 P_ADI_331 2 1 4 2 Dic E D 2 10973
14 P_ADI_331 2 1 1 3 Dic E D 1 13386
16 P_ADI_331 4 1 1 5 Dic E D 1 13876
19 P_ADI_331 3 1 2 3 Dic E D 2 12041
35 P_ADI_331 3 1 1 4 Dic E D 1 8414
36 P_ADI_331 3 1 2 4 Dic E D 1 9523
41 P_ADI_331 4 1 1 4 Dic E D 1 12265
63 P_ADI_331 4 1 4 2 Dio E G 2 9596
Truncated to displaylimit of 10.

En filtrant des lignes et des colonnes

%%sql
-- un commentaire
SELECT "Subject", "RT"
FROM df
WHERE "Space" = 'E'
  AND "Name_A" <> 0
  AND "Name_B" = 1

Running query in 'bd-pedago'

720 rows affected.

Subject RT
P_ADI_331 12117
P_ADI_331 16347
P_ADI_331 10973
P_ADI_331 13386
P_ADI_331 13876
P_ADI_331 12041
P_ADI_331 8414
P_ADI_331 9523
P_ADI_331 12265
P_ADI_331 9596
Truncated to displaylimit of 10.

En calculant une nouvelle colonne dans le SELECT

%%sql
SELECT "Subject", "Dist_A" - "Dist_B" as "DiffDist"
FROM df

Running query in 'bd-pedago'

9594 rows affected.

Subject DiffDist
P_ADI_331 -2
P_ADI_331 3
P_ADI_331 1
P_ADI_331 3
P_ADI_331 -2
P_ADI_331 -1
P_ADI_331 -1
P_ADI_331 -2
P_ADI_331 2
P_ADI_331 2
Truncated to displaylimit of 10.

On peut remarquer que la table df reste inchangée:

%%sql
SELECT *
FROM df

Running query in 'bd-pedago'

9594 rows affected.

index Subject Name_A Name_B Dist_A Dist_B Mode Space Side Response RT
0 P_ADI_331 0 2 2 4 Dic E D 2 18865
1 P_ADI_331 1 4 4 1 Dic E D 2 13157
2 P_ADI_331 4 3 3 2 Dic E D 1 11628
3 P_ADI_331 2 4 4 1 Dic E D 1 10068
4 P_ADI_331 1 2 2 4 Dic E D 1 11801
5 P_ADI_331 2 1 2 3 Dic E D 2 12117
6 P_ADI_331 2 1 3 4 Dic E D 1 16347
7 P_ADI_331 0 3 2 4 Dic E D 1 13237
8 P_ADI_331 2 0 4 2 Dic E D 2 12589
9 P_ADI_331 2 1 4 2 Dic E D 2 10973
Truncated to displaylimit of 10.
%%sql
SELECT "Subject", "Space", avg("RT") as "moy_RT", min("RT") as min_RT
FROM df
WHERE "Mode" = 'Dic'
GROUP BY "Subject", "Space"

Running query in 'bd-pedago'

48 rows affected.

Subject Space moy_RT min_rt
P_ADI_331 E 11341.280000000000 6277
P_ADI_331 I 9017.5200000000000000 5832
P_ALM_345 E 14881.910000000000 4696
P_ALM_345 I 11411.900000000000 3978
P_AMY_346 E 10409.870000000000 4080
P_AMY_346 I 9199.4200000000000000 2703
P_BAM_347 E 9476.2800000000000000 4166
P_BAM_347 I 9578.7300000000000000 3510
P_BEH_340 E 14479.700000000000 7171
P_BEH_340 I 12668.770000000000 5676
Truncated to displaylimit of 10.
Emmanuel Coquery
Emmanuel Coquery
Maître de conférences en Informatique