2022 Presidential Election
Lab notebook

This document records the analyses we carry out from day to day.

(pyvenv-workon 'polls)

Useful links

Links to the polling institutes' 2022 presidential election pages:

See also the website of the Commission des Sondages.

December 2021

Odoxa 09.12

First Odoxa poll since Pécresse was officially named candidate for Les Républicains.

num_exprimes = 1391
precision = 0.25
resultats = {
    "Arthaud": 1,
    "Poutou": 1.5,
    "Roussel": 2,
    "Mélenchon": 10,
    "Montebourg": 1,
    "Hidalgo": 3,
    "Jadot": 6,
    "Macron": 24,
    "Pécresse": 19,
    "Dupont-Aignan": 2.5,
    "Zemmour": 12,
    "Le Pen": 17,
    "Lassalle": 1,
}
assert sum(list(resultats.values())) == 100
with pm.Model() as bva:
    prior_intentions = np.array(
        [1, 1.5, 2, 10, 1, 3, 6, 24, 19, 2.5, 12, 17, 1]
    ) * 0.1
    results_r = np.array(list(resultats.values())) / 100
    precision_r = precision / 100

    p = pm.Dirichlet("intentions", prior_intentions)  # Prior too vague?
    r = pm.Dirichlet("real_ratios", num_exprimes * p, observed=results_r)

    trace = pm.sample()

figureEAx2hR.png

intentions_r = {k: trace['intentions'][:,i] for i,k in enumerate(resultats.keys())}
fig = src.intentions.plot(
    intentions_r,
    colors,
    "09/12/2021",
    "Odoxa pour L'OBS et mascaret",
    title="Intentions de vote au premier tour",
    sample_size=num_exprimes,
    base="Comptant aller voter et exprimant une opinion",
    logo_path="~/org/roam/images/logo.png"
)
plt.tight_layout()
plt.savefig(filename, dpi=600, bbox_inches="tight")
filename

figure7ejmnm.png

Odoxa does not report the percentage of respondents who are sure of their choice, so no further information can be given.

BVA 08.12

num_exprimes = 894
precision = 0.25
resultats = {
    "Arthaud": .5,
    "Poutou": 1.5,
    "Mélenchon": 9,
    "Roussel": 2.5,
    "Montebourg": 1,
    "Hidalgo": 5,
    "Jadot": 7,
    "Macron": 24,
    "Pécresse": 17,
    "Dupont-Aignan": 2.5,
    "Zemmour": 13,
    "Le Pen": 16,
    "Lassalle": 1,
}
assert sum(list(resultats.values())) == 100
with pm.Model() as bva:
    prior_intentions = np.array(
        [.5, 1.5, 9, 2.5, 1, 5, 7, 24, 17, 2.5, 13, 16, 1]
    )
    results_r = np.array(list(resultats.values())) / 100
    precision_r = precision / 100

    p = pm.Dirichlet("intentions", prior_intentions)  # Prior too vague?
    r = pm.Dirichlet("real_ratios", num_exprimes * p, observed=results_r)

    trace = pm.sample()

figure0p2VLy.png

intentions_r = {k: trace['intentions'][:,i] for i,k in enumerate(resultats.keys())}
fig = src.intentions.plot(
    intentions_r,
    colors,
    "08/12/2021",
    "BVA pour Orange et RTL",
    title="Intentions de vote au premier tour",
    sample_size=num_exprimes,
    base="Certains d'aller voter et exprimant une opinion",
    logo_path="~/org/roam/images/logo.png"
)
plt.tight_layout()
plt.savefig(filename, dpi=600, bbox_inches="tight")
filename

figurePTbhcD.png

It would still be wise to show the respondents who did not express an opinion on the plots.

Voting intentions of respondents who are sure of their choice

import math

certains_total = 71
certains = {
    "Mélenchon": 74,
    "Hidalgo": 51,
    "Jadot": 48,
    "Macron": 73,
    "Pécresse": 60,
    "Zemmour": 65,
    "Le Pen": 74,
}

# We make an assumption here (fairly well borne out by the figures).
# Candidates without a published certainty figure keep their full share.
resultats_certains = {}
total = 0
for i, c in enumerate(resultats):
    if c in certains:
        # Scale the posterior samples by the share of voters sure of their choice
        num_certains = trace['intentions'][:, i] * certains[c] / 100
    else:
        num_certains = trace['intentions'][:, i]
    resultats_certains[c] = num_certains
    total += num_certains

for c in resultats:
    resultats_certains[c] /= total
intentions_r = {k: v for k, v in resultats_certains.items()}
fig = src.intentions.plot(
    intentions_r,
    colors,
    "08/12/2021",
    "BVA pour Orange et RTL",
    title="Intentions de vote au premier tour",
    sample_size=num_exprimes,
    base="Certains d'aller voter et sûrs de leur choix",
    logo_path="~/org/roam/images/logo.png"
)
plt.tight_layout()
plt.savefig(filename, dpi=600, bbox_inches="tight")
filename

figurehbOufB.png

Multinomial distribution as observation model   model

From the start I have taken for granted that the Dirichlet-Dirichlet model gives the same results as the corresponding Dirichlet-Multinomial model, but this is far from obvious.

I first came across the idea in this master's thesis (3.1.2.5). While the multinomial observation model is easy to justify here, the Dirichlet observation model for the ratios still needs to be confirmed mathematically (the results look similar).
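A quick numerical check of this similarity in the two-candidate case, where both posteriors can be evaluated on a grid without MCMC. This is only a sketch: the sample size `N`, the published share `r` and the Beta(5, 5) prior are made-up values, and rounding is ignored.

```python
import math

N = 1000   # hypothetical number of respondents
r = 0.54   # hypothetical published share of the first candidate
n = round(r * N)
grid = [i / 2000 for i in range(1, 2000)]  # values of p in (0, 1)


def binomial_loglik(p):
    # Multinomial observation model reduced to two candidates.
    return n * math.log(p) + (N - n) * math.log(1 - p)


def dirichlet_loglik(p):
    # Dirichlet observation model: r ~ Beta(N * p, N * (1 - p)).
    a, b = N * p, N * (1 - p)
    return (math.lgamma(N) - math.lgamma(a) - math.lgamma(b)
            + (a - 1) * math.log(r) + (b - 1) * math.log(1 - r))


def posterior_mean_sd(loglik, prior_a=5.0, prior_b=5.0):
    # Unnormalised log-posterior on the grid with a Beta(prior_a, prior_b) prior.
    logp = [loglik(p) + (prior_a - 1) * math.log(p) + (prior_b - 1) * math.log(1 - p)
            for p in grid]
    top = max(logp)
    weights = [math.exp(v - top) for v in logp]
    norm = sum(weights)
    mean = sum(p * w for p, w in zip(grid, weights)) / norm
    var = sum((p - mean) ** 2 * w for p, w in zip(grid, weights)) / norm
    return mean, math.sqrt(var)


mean_b, sd_b = posterior_mean_sd(binomial_loglik)
mean_d, sd_d = posterior_mean_sd(dirichlet_loglik)
print(f"multinomial: mean={mean_b:.4f} sd={sd_b:.4f}")
print(f"dirichlet:   mean={mean_d:.4f} sd={sd_d:.4f}")
```

With these values the two posteriors should have nearly the same mean and spread. That is a numerical hint for one setting, not a proof of equivalence.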

Polling institutes do not publish the raw voting intentions; they round them to the nearest integer (or the nearest half percentage point), which adds further uncertainty. Let us drop the Dirichlet-Dirichlet model for now and return to a basic Dirichlet-Multinomial model. The effect of rounding is included directly, by stating that we do not observe the ratio \(\mathbf{r}\) itself but \(\tilde{\mathbf{r}}\):

\begin{align*}
  \mathbf{p} &\sim \operatorname{Dirichlet}(\boldsymbol{\alpha})\\
  \mathbf{n} &\sim \operatorname{Multinomial}(\mathbf{p}, N)\\
  \mathbf{r} &= \frac{\mathbf{n}}{N}\\
  \tilde{\mathbf{r}} &\sim \operatorname{Uniform}(\mathbf{r}-\delta, \mathbf{r}+\delta)
\end{align*}
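To see where \(\delta\) comes from, here is a forward simulation of the generative story with NumPy only. It is a sketch: it applies deterministic rounding to the nearest half point (which the uniform term in the model approximates) and reuses the BVA sample size and prior weights as example values.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 894            # respondents expressing an opinion (BVA figure used above)
step = 0.005       # results published to the nearest half point
delta = step / 2   # half-width of the rounding interval (0.25 point)

alpha = np.array([.5, 1.5, 9, 2.5, 1, 5, 7, 24, 17, 2.5, 13, 16, 1])

p = rng.dirichlet(alpha)             # latent voting intentions
n = rng.multinomial(N, p)            # raw counts among the respondents
r = n / N                            # exact ratios, never published
r_tilde = np.round(r / step) * step  # what the institute reports

# The published value is never further than delta from the exact ratio.
max_error = np.abs(r_tilde - r).max()
print(max_error <= delta)
```

Note that the rounded values `r_tilde` need not sum exactly to 1, which the published tables sometimes show as well.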

The model is very easy to implement in PyMC3:

results_r = np.array(list(resultats.values())) / 100
precision_r = precision / 100

with pm.Model() as multinomial:
    prior_intentions = np.array(
        [.5, 1.5, 9, 2.5, 1, 5, 7, 24, 17, 2.5, 13, 16, 1]
    ) * 0.1
    p = pm.Dirichlet("intentions", prior_intentions, shape=(1,len(prior_intentions)))
    n = pm.Multinomial("respondants", num_exprimes, p, shape=(1, len(prior_intentions)))
    r = n / num_exprimes
    r_obs = pm.Uniform('observed', r-precision_r, r+precision_r, observed=results_r)

    trace = pm.sample()

figure3RFaIr.png

intentions_r = {k: trace['intentions'][:,0, i] for i,k in enumerate(resultats.keys())}
fig = src.intentions.plot(
    intentions_r,
    colors,
    "08/12/2021",
    "BVA pour Orange et RTL",
    title="Intentions de vote au premier tour",
    sample_size=num_exprimes,
    base="Certains d'aller voter et exprimant une opinion",
    logo_path="~/org/roam/images/logo.png"
)
plt.tight_layout()
plt.savefig(filename, dpi=600, bbox_inches="tight")
print(filename)

The confidence intervals are slightly wider. Now suppose the results are given to within plus or minus one point!

results_r = np.array(list(resultats.values())) / 100
precision_r = 1. / 100

with pm.Model() as multinomial:
    prior_intentions = np.array(
        [.5, 1.5, 9, 2.5, 1, 5, 7, 24, 17, 2.5, 13, 16, 1]
    )
    p = pm.Dirichlet("intentions", prior_intentions, shape=(1,len(prior_intentions)))
    n = pm.Multinomial("respondants", num_exprimes, p, shape=(1, len(prior_intentions)))
    r = n / num_exprimes
    r_obs = pm.Uniform('observed', r-precision_r, r+precision_r, observed=results_r)

    trace = pm.sample()

figureqJfd6w.png

intentions_r = {k: trace['intentions'][:,0, i] for i,k in enumerate(resultats.keys())}
fig = src.intentions.plot(
    intentions_r,
    colors,
    "08/12/2021",
    "BVA pour Orange et RTL",
    title="Intentions de vote au premier tour",
    sample_size=num_exprimes,
    base="Certains d'aller voter et exprimant une opinion",
    logo_path="~/org/roam/images/logo.png"
)
plt.tight_layout()
plt.savefig(filename, dpi=600, bbox_inches="tight")
filename

figureQeoqFc.png

Pairwise comparisons   viz

[2021-12-14 Tue]

Randomly scattering points does not give great results. We can probably improve the layout using blue noise. The idea to get a good enough plot (we're not aiming for accuracy on these plots) would be to:

  1. Generate a set of \(N_p\) points, with \(N_p \gg 100\), between -20% and +20%
  2. For each simulation, find the point with the closest x value. Set to occupied.
  3. Then only display the occupied circles.
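A possible sketch of these three steps. The candidate positions come from Mitchell's best-candidate algorithm, used here as a cheap stand-in for blue noise. The function names, the number of candidates and the rule that a slot holds at most one simulation are my assumptions, not what `src.intentions.plot_pair` does.

```python
import numpy as np


def best_candidate_points(num_points, x_range, rng, k=10):
    """Approximate blue noise: keep the candidate farthest from existing points."""
    lo, hi = x_range
    scale = np.array([hi - lo, 1.0])  # compare distances in the unit square
    points = np.empty((num_points, 2))
    points[0] = rng.uniform([lo, 0.0], [hi, 1.0])
    for i in range(1, num_points):
        cand = rng.uniform([lo, 0.0], [hi, 1.0], size=(k, 2))
        dist = np.linalg.norm((cand[:, None, :] - points[None, :i, :]) / scale, axis=2)
        points[i] = cand[np.argmax(dist.min(axis=1))]
    return points


def place_simulations(differences, num_candidates=600, x_range=(-0.20, 0.20), seed=0):
    rng = np.random.default_rng(seed)
    # 1. Many more candidate positions than simulations.
    points = best_candidate_points(num_candidates, x_range, rng)
    occupied = np.zeros(num_candidates, dtype=bool)
    # 2. Each simulation claims the free position with the closest x value.
    for d in differences:
        dist = np.abs(points[:, 0] - d)
        dist[occupied] = np.inf  # assumption: one simulation per position
        occupied[np.argmin(dist)] = True
    # 3. Only the occupied positions are displayed.
    return points[occupied]


rng = np.random.default_rng(1)
differences = rng.normal(0.02, 0.03, size=100)  # made-up score differences
shown = place_simulations(differences)
print(shown.shape)
```

The x position of each circle is then only approximately the simulated difference, which is acceptable since accuracy is not the goal of these plots.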

[2021-12-15 Wed]

In the context of a left-wing primary, let us compare the results of the different candidates:

atyK4m.png

Let us look at the potential results of a "union of the left":

t4Ws7E.png

import matplotlib.pyplot as plt
import os
from pygifsicle import optimize
import imageio

reference = "Le Pen"
challenger = "Pécresse"

wins = np.ceil(100 * np.sum(intentions_r[reference]>intentions_r[challenger]) / len(intentions_r[reference]))

filenames = []
for i in range(1, 100):
    if i % 10 == 0:
        print(i)
    plt.clf()
    fig = src.intentions.plot_pair(
        intentions_r,
        colors,
        reference,
        challenger,
        scores={reference: f"{wins:.0f} sur 100", challenger: f"{100-wins:.0f} sur 100"},
        num_points=i
    )

    filename = f"intentions-pairwise-{i}.png"
    plt.savefig(filename, bbox_inches="tight")
    filenames.append(filename)

with imageio.get_writer("intentions-pairwise.gif", mode="I") as writer:
    for filename in filenames:
        image = imageio.imread(filename)
        writer.append_data(image)

optimize("intentions-pairwise.gif", "optimized.gif")  # For creating a new one

for filename in set(filenames):
    os.remove(filename)

  • DONE Add a legend to the plot
  • DONE Place points using blue noise
  • DONE Compute location of vertical lines automatically

Harris 13.12

num_exprimes = int(2159 * (1-0.12))
precision = 0.5
resultats = {
    #"Arthaud": 0,
    "Poutou": 1,
    "Roussel": 2,
    "Mélenchon": 11,
    "Montebourg": 1,
    "Hidalgo": 4,
    "Jadot": 7,
    "Macron": 24,
    "Pécresse": 17,
    "Dupont-Aignan": 2,
    "Zemmour": 15,
    "Le Pen": 16,
    #"Lassalle": 0,
    #"Philippot": 0,
    #"Asselineau": 0,
}
assert sum(list(resultats.values())) == 100
results_r = np.array(list(resultats.values())) / 100
precision_r = precision / 100

with pm.Model() as multinomial:
    prior_intentions = np.array(
        [1, 2, 11, 1, 4, 7, 24, 17, 2, 15, 16]
    ) * 0.1
    p = pm.Dirichlet("intentions", prior_intentions, shape=(1,len(prior_intentions)))
    n = pm.Multinomial("respondants", num_exprimes, p, shape=(1, len(prior_intentions)))
    r = n / num_exprimes
    r_obs = pm.Uniform('observed', r-precision_r, r+precision_r, observed=results_r)

    trace = pm.sample()

figurek2eqJC.png

intentions_r = {k: trace['intentions'][:,0, i] for i,k in enumerate(resultats.keys())}
fig = src.intentions.plot(
    intentions_r,
    colors,
    "13/12/2021",
    "Harris interactive for Challenges",
    title="Intentions de vote au premier tour",
    sample_size=num_exprimes,
    base="Inscrits sur les listes électorales",
    logo_path="~/org/roam/images/logo.png"
)
plt.tight_layout()
plt.savefig(filename, dpi=600, bbox_inches="tight")
filename

figure9jWzjU.png

fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Le Pen",
    "Pécresse",
    num_points=100
)
plt.savefig(filename, bbox_inches="tight")
filename

veCirZ.png

import matplotlib.pyplot as plt
import os
from pygifsicle import optimize
import imageio

reference = "Le Pen"
challenger = "Pécresse"

wins = np.ceil(100 * np.sum(intentions_r[reference]>intentions_r[challenger]) / len(intentions_r[reference]))

filenames = []
for i in range(1, 100):
    if i % 10 == 0:
        print(i)
    plt.clf()
    fig = src.intentions.plot_pair(
        intentions_r,
        colors,
        reference,
        challenger,
        scores={reference: f"{wins:.0f} sur 100", challenger: f"{100-wins:.0f} sur 100"},
        num_points=i
    )

    filename = f"intentions-pairwise-{i}.png"
    plt.savefig(filename, bbox_inches="tight")
    filenames.append(filename)

with imageio.get_writer("intentions-pairwise.gif", mode="I") as writer:
    for filename in filenames:
        image = imageio.imread(filename)
        writer.append_data(image)

optimize("intentions-pairwise.gif", "lepenpecresse.gif")  # For creating a new one

for filename in set(filenames):
    os.remove(filename)

import matplotlib.pyplot as plt
import os
from pygifsicle import optimize
import imageio

reference = "Le Pen"
challenger = "Zemmour"

wins = np.ceil(100 * np.sum(intentions_r[reference]>intentions_r[challenger]) / len(intentions_r[reference]))

filenames = []
for i in range(1, 100):
    if i % 10 == 0:
        print(i)
    plt.clf()
    fig = src.intentions.plot_pair(
        intentions_r,
        colors,
        reference,
        challenger,
        scores={reference: f"{wins:.0f} sur 100", challenger: f"{100-wins:.0f} sur 100"},
        num_points=i
    )

    filename = f"intentions-pairwise-{i}.png"
    plt.savefig(filename, bbox_inches="tight")
    filenames.append(filename)

with imageio.get_writer("intentions-pairwise.gif", mode="I") as writer:
    for filename in filenames:
        image = imageio.imread(filename)
        writer.append_data(image)

optimize("intentions-pairwise.gif", "zemmourlepen.gif")  # For creating a new one

for filename in set(filenames):
    os.remove(filename)

Opinionway 15.12

First round

num_exprimes = int(1470 * (1-0.16))
precision = 1.
resultats = {
    "Arthaud": 1,
    "Poutou": 1,
    "Roussel": 3,
    "Mélenchon": 9,
    "Montebourg": 2,
    "Hidalgo": 4,
    "Jadot": 8,
    "Macron": 24,
    "Pécresse": 17,
    "Dupont-Aignan": 2,
    "Zemmour": 12,
    "Le Pen": 16,
    "Lassalle": 1,
    #"Philippot": 0,
    #"Asselineau": 0,
}
assert sum(list(resultats.values())) == 100
results_r = np.array(list(resultats.values())) / 100
precision_r = precision / 100

with pm.Model() as multinomial:
    prior_intentions = np.array(
        [1, 1, 3, 9, 2, 4, 8, 24, 17, 2, 12, 16, 1]
    ) * 0.1
    p = pm.Dirichlet("intentions", prior_intentions, shape=(1,len(prior_intentions)))
    n = pm.Multinomial("respondants", num_exprimes, p, shape=(1, len(prior_intentions)))
    r = n / num_exprimes
    r_obs = pm.Uniform('observed', r-precision_r, r+precision_r, observed=results_r)

    trace = pm.sample()

figuresvVTFe.png

intentions_r = {k: trace['intentions'][:,0, i] for i,k in enumerate(resultats.keys())}
fig = src.intentions.plot(
    intentions_r,
    colors,
    "15/12/2021",
    "Opinionway pour Les Echos et Radio Classique",
    title="Intentions de vote au premier tour",
    sample_size=num_exprimes,
    base="Inscrits sur les listes électorales",
    logo_path="~/org/roam/images/logo.png"
)
plt.tight_layout()
plt.savefig(filename, dpi=200, bbox_inches="tight")
filename

figure2M59Zh.png

fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Le Pen",
    "Pécresse",
    num_points=100
)
plt.savefig(filename, bbox_inches="tight")
filename

vAkUGN.png

fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Jadot",
    "Mélenchon",
    num_points=100
)
plt.savefig(filename, bbox_inches="tight")
filename

Fnmff2.png

fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Zemmour",
    "Mélenchon",
    num_points=100
)
plt.savefig(filename, bbox_inches="tight")
filename

U9UISJ.png

fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Hidalgo",
    "Roussel",
    num_points=100
)
plt.savefig(filename, bbox_inches="tight")
filename

y8Aq8J.png

import matplotlib.pyplot as plt
import os
from pygifsicle import optimize
import imageio

reference = "Le Pen"
challenger = "Pécresse"

wins = np.ceil(100 * np.sum(intentions_r[reference]>intentions_r[challenger]) / len(intentions_r[reference]))

filenames = []
for i in range(1, 100):
    if i % 10 == 0:
        print(i)
    plt.clf()
    fig = src.intentions.plot_pair(
        intentions_r,
        colors,
        reference,
        challenger,
        scores={reference: f"{wins:.0f} sur 100", challenger: f"{100-wins:.0f} sur 100"},
        num_points=i
    )

    filename = f"intentions-pairwise-{i}.png"
    plt.savefig(filename, bbox_inches="tight")
    filenames.append(filename)

with imageio.get_writer("intentions-pairwise.gif", mode="I") as writer:
    for filename in filenames:
        image = imageio.imread(filename)
        writer.append_data(image)

optimize("intentions-pairwise.gif", "lepenpecresse.gif")  # For creating a new one

for filename in set(filenames):
    os.remove(filename)

import matplotlib.pyplot as plt
import os
from pygifsicle import optimize
import imageio

reference = "Jadot"
challenger = "Mélenchon"

wins = np.ceil(100 * np.sum(intentions_r[reference]>intentions_r[challenger]) / len(intentions_r[reference]))

filenames = []
for i in range(1, 100):
    if i % 10 == 0:
        print(i)
    plt.clf()
    fig = src.intentions.plot_pair(
        intentions_r,
        colors,
        reference,
        challenger,
        scores={reference: f"{wins:.0f} sur 100", challenger: f"{100-wins:.0f} sur 100"},
        num_points=i
    )

    filename = f"intentions-pairwise-{i}.png"
    plt.savefig(filename, bbox_inches="tight")
    filenames.append(filename)

with imageio.get_writer("intentions-pairwise.gif", mode="I") as writer:
    for filename in filenames:
        image = imageio.imread(filename)
        writer.append_data(image)

optimize("intentions-pairwise.gif", "jadotmelenchon.gif")  # For creating a new one

for filename in set(filenames):
    os.remove(filename)

import matplotlib.pyplot as plt
import os
from pygifsicle import optimize
import imageio

reference = "Zemmour"
challenger = "Mélenchon"

wins = np.ceil(100 * np.sum(intentions_r[reference]>intentions_r[challenger]) / len(intentions_r[reference]))

filenames = []
for i in range(1, 100):
    if i % 10 == 0:
        print(i)
    plt.clf()
    fig = src.intentions.plot_pair(
        intentions_r,
        colors,
        reference,
        challenger,
        scores={reference: f"{wins:.0f} sur 100", challenger: f"{100-wins:.0f} sur 100"},
        num_points=i
    )

    filename = f"intentions-pairwise-{i}.png"
    plt.savefig(filename, bbox_inches="tight")
    filenames.append(filename)

with imageio.get_writer("intentions-pairwise.gif", mode="I") as writer:
    for filename in filenames:
        image = imageio.imread(filename)
        writer.append_data(image)

optimize("intentions-pairwise.gif", "zemmourmelenchon.gif")  # For creating a new one

for filename in set(filenames):
    os.remove(filename)

import matplotlib.pyplot as plt
import os
from pygifsicle import optimize
import imageio

reference = "Hidalgo"
challenger = "Roussel"

wins = np.ceil(100 * np.sum(intentions_r[reference]>intentions_r[challenger]) / len(intentions_r[reference]))

filenames = []
for i in range(1, 100):
    if i % 10 == 0:
        print(i)
    plt.clf()
    fig = src.intentions.plot_pair(
        intentions_r,
        colors,
        reference,
        challenger,
        scores={reference: f"{wins:.0f} sur 100", challenger: f"{100-wins:.0f} sur 100"},
        num_points=i
    )

    filename = f"intentions-pairwise-{i}.png"
    plt.savefig(filename, bbox_inches="tight")
    filenames.append(filename)

with imageio.get_writer("intentions-pairwise.gif", mode="I") as writer:
    for filename in filenames:
        image = imageio.imread(filename)
        writer.append_data(image)

optimize("intentions-pairwise.gif", "hidalgoroussel.gif")  # For creating a new one

for filename in set(filenames):
    os.remove(filename)

import matplotlib.pyplot as plt
import os
from pygifsicle import optimize
import imageio

reference = "Montebourg"
challenger = "Roussel"

wins = np.ceil(100 * np.sum(intentions_r[reference]>intentions_r[challenger]) / len(intentions_r[reference]))

filenames = []
for i in range(1, 100):
    if i % 10 == 0:
        print(i)
    plt.clf()
    fig = src.intentions.plot_pair(
        intentions_r,
        colors,
        reference,
        challenger,
        scores={reference: f"{wins:.0f} sur 100", challenger: f"{100-wins:.0f} sur 100"},
        num_points=i
    )

    filename = f"intentions-pairwise-{i}.png"
    plt.savefig(filename, bbox_inches="tight")
    filenames.append(filename)

with imageio.get_writer("intentions-pairwise.gif", mode="I") as writer:
    for filename in filenames:
        image = imageio.imread(filename)
        writer.append_data(image)

optimize("intentions-pairwise.gif", "montebourgroussel.gif")  # For creating a new one

for filename in set(filenames):
    os.remove(filename)

Second round

nspp = 40
num_exprimes = int(1470 * (1-nspp/100))
precision = 1.
resultats = {
    "Macron": 54,
    "Pécresse": 46,
}
assert sum(list(resultats.values())) == 100
results_r = np.array(list(resultats.values())) / 100
precision_r = precision / 100
with pm.Model() as multinomial:
    prior_intentions = np.array(
        [50, 50]
    ) * 0.1
    p = pm.Dirichlet("intentions", prior_intentions, shape=(1,len(prior_intentions)))
    r_obs = pm.Dirichlet("respondants", num_exprimes * p, observed=results_r)
    trace = pm.sample()

figureWCEwG9.png

intentions_decided_r = {k: trace['intentions'][:,0, i] for i,k in enumerate(resultats.keys())}
fig = src.intentions.plot_pair(
    intentions_decided_r,
    colors,
    "Macron",
    "Pécresse",
    title="Différence d'intentions de vote au 2nd tour",
    num_points=100
)
plt.savefig(filename, bbox_inches="tight")
filename

dl5xun.png

OK, but 40% of respondents are undecided! What happens if we distribute them at random?

import matplotlib.pyplot as plt
import os
from pygifsicle import optimize
import imageio

num_indecis = 1470 * nspp / 100

rng = np.random.default_rng()
transition_i = rng.dirichlet(np.ones(2), size=4000).T

intentions = {
    c: num_exprimes * intentions_decided_r[c] for c in resultats
}
intentions_values = np.stack(list(intentions.values()))  # np.stack needs a sequence, not a dict view
intentions_randomized = intentions_values + num_indecis * transition_i
intentions_r = {k: intentions_randomized[i]/1470 for i,k in enumerate(resultats.keys())}

reference = "Macron"
challenger = "Pécresse"

wins = np.ceil(100 * np.sum(intentions_r[reference]>intentions_r[challenger]) / len(intentions_r[reference]))

filenames = []
for i in range(1, 100):
    if i % 10 == 0:
        print(i)
    plt.clf()
    fig = src.intentions.plot_pair(
        intentions_r,
        colors,
        reference,
        challenger,
        title="Si tous les indécis choisissaient au hasard",
        scores={reference: f"{wins:.0f} sur 100", challenger: f"{100-wins:.0f} sur 100"},
        num_points=i
    )

    filename = f"intentions-pairwise-{i}.png"
    plt.savefig(filename, bbox_inches="tight")
    filenames.append(filename)

with imageio.get_writer("intentions-pairwise.gif", mode="I") as writer:
    for filename in filenames:
        image = imageio.imread(filename)
        writer.append_data(image)

optimize("intentions-pairwise.gif", "secontourhalfrandomized.gif")  # For creating a new one

for filename in set(filenames):
    os.remove(filename)
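"At random" can be read in two ways, and they give very different pictures. The sketch below compares (a) drawing a single split of the undecided from a flat Dirichlet, as in the code above, with (b) letting each undecided voter flip a fair coin. To stay self-contained it uses point estimates for the decided voters (54% / 46% of 882) instead of the posterior samples, which is a simplification.

```python
import numpy as np

rng = np.random.default_rng(0)

total = 1470
nspp = 40
num_indecis = int(total * nspp / 100)        # undecided respondents
num_decided = total - num_indecis
decided_macron = round(0.54 * num_decided)   # point estimate, no posterior here

size = 4000

# (a) One split share per simulation, uniform between 0 and 1.
share = rng.dirichlet(np.ones(2), size=size)[:, 0]
macron_a = decided_macron + num_indecis * share

# (b) Every undecided voter chooses independently with probability 1/2.
macron_b = decided_macron + rng.binomial(num_indecis, 0.5, size=size)

win_a = np.mean(macron_a > total / 2)
win_b = np.mean(macron_b > total / 2)
print(f"flat split: {win_a:.2f}  coin flips: {win_b:.2f}")
```

Under (b) the undecided split almost evenly, so the leader among decided voters nearly always keeps the lead; under (a) the outcome is much more open.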

Cluster17 15.12

[2021-12-19 Sun]

num_exprimes = 1446
precision = .25
resultats = {
    "Arthaud": .5,
    "Poutou": 1,
    "Roussel": 2,
    "Mélenchon": 13,
    "Montebourg": 1,
    "Hidalgo": 3,
    "Jadot": 5,
    "Macron": 22,
    "Pécresse": 18,
    "Dupont-Aignan": 1.5,
    "Zemmour": 15,
    "Le Pen": 15,
    "Lassalle": 1,
    "Philippot": 1,
    "Asselineau": 1,
}
assert sum(list(resultats.values())) == 100

We use the Dirichlet-Multinomial model with a uniform distribution to model the rounding:

results_r = np.array(list(resultats.values())) / 100
precision_r = precision / 100

with pm.Model() as multinomial:
    prior_intentions = np.array(list(resultats.values())) * 0.1
    p = pm.Dirichlet("intentions", prior_intentions, shape=(1,len(prior_intentions)))
    n = pm.Multinomial("respondants", num_exprimes, p, shape=(1, len(prior_intentions)))
    r = n / num_exprimes
    r_obs = pm.Uniform('observed', r-precision_r, r+precision_r, observed=results_r)

    trace = pm.sample()

We check that sampling went well:

Then we plot the voting intentions:

intentions_r = {k: trace['intentions'][:,0, i] for i,k in enumerate(resultats.keys())}
fig = src.intentions.plot(
    intentions_r,
    colors,
    "15/12/2021",
    "Cluster17",
    title="Intentions de vote au premier tour",
    sample_size=num_exprimes,
    base="Inscrits sur les listes électorales",
    logo_path="~/org/roam/images/logo.png"
)
plt.tight_layout()
plt.savefig(filename, dpi=200, bbox_inches="tight")
filename

figureKjkeQG.png

For the first time, Mélenchon has a non-zero chance of finishing ahead of Marine Le Pen:

fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Le Pen",
    "Mélenchon",
    num_points=100
)
plt.savefig(filename, bbox_inches="tight")
filename

As an animated GIF:

import matplotlib.pyplot as plt
import os
from pygifsicle import optimize
import imageio

reference = "Le Pen"
challenger = "Mélenchon"

wins = np.ceil(100 * np.sum(intentions_r[reference]>intentions_r[challenger]) / len(intentions_r[reference]))

filenames = []
for i in range(1, 100):
    if i % 10 == 0:
        print(i)
    plt.clf()
    fig = src.intentions.plot_pair(
        intentions_r,
        colors,
        reference,
        challenger,
        scores={reference: f"{wins:.0f} sur 100", challenger: f"{100-wins:.0f} sur 100"},
        num_points=i
    )

    filename = f"intentions-pairwise-{i}.png"
    plt.savefig(filename, bbox_inches="tight")
    filenames.append(filename)

with imageio.get_writer("intentions-pairwise.gif", mode="I") as writer:
    for filename in filenames:
        image = imageio.imread(filename)
        writer.append_data(image)

optimize("intentions-pairwise.gif", "lepenmelenchon.gif")  # For creating a new one

for filename in set(filenames):
    os.remove(filename)
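The Le Pen / Mélenchon comparison can be cross-checked without MCMC. If rounding is ignored, the Dirichlet prior and the multinomial observation model are conjugate, so the posterior is a Dirichlet with parameters prior + counts. The sketch below relies on that approximation and on counts reconstructed from the published percentages, so its result need not match the PyMC3 trace exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

num_exprimes = 1446
resultats = {
    "Arthaud": .5, "Poutou": 1, "Roussel": 2, "Mélenchon": 13,
    "Montebourg": 1, "Hidalgo": 3, "Jadot": 5, "Macron": 22,
    "Pécresse": 18, "Dupont-Aignan": 1.5, "Zemmour": 15, "Le Pen": 15,
    "Lassalle": 1, "Philippot": 1, "Asselineau": 1,
}
names = list(resultats)
shares = np.array(list(resultats.values()))

prior = shares * 0.1                   # same prior weights as in the model
counts = shares / 100 * num_exprimes   # approximate counts (rounding ignored)
samples = rng.dirichlet(prior + counts, size=20000)

melenchon = samples[:, names.index("Mélenchon")]
le_pen = samples[:, names.index("Le Pen")]
prob = np.mean(melenchon > le_pen)
print(f"P(Mélenchon ahead of Le Pen) ~ {prob:.2f}")
```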

We then plot the most relevant comparisons, given how close the candidates' scores are or given the political situation (here, the left-wing primary):

fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Hidalgo",
    "Jadot",
    num_points=100
)
plt.savefig(filename, bbox_inches="tight")
filename
fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Le Pen",
    "Zemmour",
    num_points=100
)
plt.savefig(filename, bbox_inches="tight")
filename
fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Le Pen",
    "Pécresse",
    num_points=100
)
plt.savefig(filename, bbox_inches="tight")
filename
fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Mélenchon",
    "Jadot",
    num_points=100
)
plt.savefig(filename, bbox_inches="tight")
filename

Ipsos

[2021-12-19 Sun]

num_exprimes = int(10928 * 0.61 * (1-.06))
precision = .25
resultats = {
    "Poutou": 1.5,
    "Arthaud": 0.5,
    "Roussel": 2,
    "Mélenchon": 8.5,
    "Montebourg": 1.5,
    "Hidalgo": 4.5,
    "Jadot": 8.5,
    "Macron": 24,
    "Pécresse": 17,
    "Dupont-Aignan": 2,
    "Zemmour": 14.5,
    "Le Pen": 14.5,
    "Lassalle": 1,
}
assert sum(list(resultats.values())) == 100

Here the code uses the Dirichlet-Dirichlet observation model, without the uniform rounding term:

results_r = np.array(list(resultats.values())) / 100
precision_r = precision / 100

with pm.Model() as multinomial:
    prior_intentions = np.array(list(resultats.values())) * 0.1
    p = pm.Dirichlet("intentions", prior_intentions, shape=(1,len(prior_intentions)))
    n = pm.Dirichlet("respondants", num_exprimes * p, observed=results_r)

    trace = pm.sample()

We check that sampling went well:

figureSp0UjM.png

intentions_r = {k: trace['intentions'][:,0, i] for i,k in enumerate(resultats.keys())}
fig = src.intentions.plot(
    intentions_r,
    colors,
    "13/12/2021",
    "Ipsos",
    title="Intentions de vote au premier tour",
    sample_size=num_exprimes,
    base="Certains d'aller voter (61% des interrogés)",
    logo_path="~/org/roam/images/logo.png"
)
plt.tight_layout()
plt.savefig(filename, dpi=200, bbox_inches="tight")
filename

figureGsopf6.png

The difference with the Cluster17 poll is striking and underlines the need for an aggregator. I think most people find it rather comical to see very different results in the morning and in the evening of the same day.
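To put a number on "striking", here is a back-of-the-envelope check on Mélenchon's score (13% for Cluster17, 8.5% for Ipsos). It treats both polls as simple random samples, which quota-based polls are not, so the z-score is only indicative.

```python
import math


def diff_z_score(p1, n1, p2, n2):
    """Difference between two proportions in units of its standard error."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se


n_cluster17 = 1446
n_ipsos = int(10928 * 0.61 * (1 - .06))

z = diff_z_score(0.13, n_cluster17, 0.085, n_ipsos)
print(f"z = {z:.1f}")
```

A value well above 3 means sampling noise alone is very unlikely to explain the gap; differences in method between institutes are the more plausible cause.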

Ifop

[2021-12-20 Mon]

num_exprimes = 1017
precision = .25
resultats = {
    "Poutou": 0.5,
    "Roussel": 3,
    "Mélenchon": 9.5,
    "Montebourg": 1,
    "Hidalgo": 4.5,
    "Jadot": 7.5,
    "Macron": 25.5,
    "Pécresse": 18,
    "Dupont-Aignan": 2,
    "Zemmour": 12,
    "Le Pen": 16,
    "Lassalle": 0.5,
}
assert sum(list(resultats.values())) == 100
results_r = np.array(list(resultats.values())) / 100
precision_r = precision / 100

with pm.Model() as multinomial:
    prior_intentions = np.array(list(resultats.values())) * 0.1
    p = pm.Dirichlet("intentions", prior_intentions, shape=(1,len(prior_intentions)))
    n = pm.Multinomial("respondants", num_exprimes, p, shape=(1, len(prior_intentions)))
    r = n / num_exprimes
    r_obs = pm.Uniform('observed', r-precision_r, r+precision_r, observed=results_r)

    trace = pm.sample()

figureWs6u38.png

intentions_r = {k: trace['intentions'][:,0, i] for i,k in enumerate(resultats.keys())}
fig = src.intentions.plot(
    intentions_r,
    colors,
    "15/12/2021",
    "Ifop",
    title="Intentions de vote au premier tour",
    sample_size=num_exprimes,
    base="Inscrits sur les listes électorales",
    logo_path="~/org/roam/images/logo.png"
)
plt.tight_layout()
plt.savefig(filename, dpi=200, bbox_inches="tight")
filename

figuretjIX6i.png

Several duels are interesting:

  • Zemmour / Le Pen
  • Le Pen / Pécresse
  • Mélenchon / Zemmour
  • Jadot / Mélenchon
  • Hidalgo / Roussel
fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Zemmour",
    "Le Pen",
    num_points=100
)
plt.savefig(filename, bbox_inches="tight")
filename

DtxabL.png

fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Pécresse",
    "Le Pen",
    num_points=100
)
plt.savefig(filename, bbox_inches="tight")
filename

JPKuR4.png

fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Zemmour",
    "Mélenchon",
    num_points=100
)
plt.savefig(filename, bbox_inches="tight")
filename

mUHqqS.png

import matplotlib.pyplot as plt
import os
from pygifsicle import optimize
import imageio

reference = "Zemmour"
challenger = "Mélenchon"

wins = np.ceil(100 * np.sum(intentions_r[reference]>intentions_r[challenger]) / len(intentions_r[reference]))

filenames = []
for i in range(1, 100):
    if i % 10 == 0:
        print(i)
    plt.clf()
    fig = src.intentions.plot_pair(
        intentions_r,
        colors,
        reference,
        challenger,
        scores={reference: f"{wins:.0f} sur 100", challenger: f"{100-wins:.0f} sur 100"},
        num_points=i
    )

    filename = f"intentions-pairwise-{i}.png"
    plt.savefig(filename, bbox_inches="tight")
    filenames.append(filename)

with imageio.get_writer("intentions-pairwise.gif", mode="I") as writer:
    for filename in filenames:
        image = imageio.imread(filename)
        writer.append_data(image)

optimize("intentions-pairwise.gif", "zemmourmelenchon.gif")  # For creating a new one

for filename in set(filenames):
    os.remove(filename)

fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Jadot",
    "Mélenchon",
    num_points=100
)
plt.savefig(filename, bbox_inches="tight")
filename

b6DvWw.png

fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Jadot",
    "Hidalgo",
    num_points=100
)
plt.savefig(filename, bbox_inches="tight")
filename

fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Roussel",
    "Hidalgo",
    num_points=100
)
plt.savefig(filename, bbox_inches="tight")
filename

hfrr4Z.png

Elabe

[2021-12-22 Wed]

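num_exprimes = int(1351 * (1-.12) * 0.8)  # 0.8: share intending to vote (see `base` in the plot call); .12: presumably respondents giving no choice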
precision = .5
resultats = {
    "Arthaud": 1,
    "Poutou": 1,
    "Roussel": 1,
    "Mélenchon": 11,
    "Montebourg": 2,
    "Hidalgo": 3,
    "Jadot": 5,
    "Macron": 26,
    "Pécresse": 17,
    "Lassalle": 1,
    "Dupont-Aignan": 2,
    "Philippot": 1,
    "Le Pen": 16,
    "Zemmour": 13,
}
assert sum(list(resultats.values())) == 100
results_r = np.array(list(resultats.values())) / 100
precision_r = precision / 100

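with pm.Model() as multinomial:
    # Weak Dirichlet prior centred on the published percentages (total concentration 10)
    prior_intentions = np.array(list(resultats.values())) * 0.1
    p = pm.Dirichlet("intentions", prior_intentions, shape=(1,len(prior_intentions)))
    # Latent number of respondents choosing each candidate
    n = pm.Multinomial("respondants", num_exprimes, p, shape=(1, len(prior_intentions)))
    r = n / num_exprimes
    # Published figures are rounded: each lies within +/- precision_r of the sample ratio
    r_obs = pm.Uniform('observed', r-precision_r, r+precision_r, observed=results_r)

    trace = pm.sample()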

figure8KYFlv.png
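As we read it, the model above tells the following story: the true intentions `p` produce multinomial counts among the `num_exprimes` respondents, and the institute publishes the resulting percentages after rounding, so each published figure lies within `precision` points of the sample percentage. A forward simulation of that story, as a sketch with hypothetical true shares (plain NumPy, not the PyMC model itself):

```python
import numpy as np

rng = np.random.default_rng(42)

num_exprimes = int(1351 * (1 - .12) * 0.8)  # same sample size as the Elabe entry
precision = 0.5                             # points: figures rounded to the nearest 1%

# Hypothetical true shares for six groups of voters (they sum to 1).
true_shares = np.array([0.26, 0.17, 0.16, 0.13, 0.11, 0.17])

# Step 1: how many respondents pick each option.
counts = rng.multinomial(num_exprimes, true_shares)

# Step 2: exact sample percentages.
ratios = 100 * counts / num_exprimes

# Step 3: what gets published, rounded to a grid of width 2 * precision.
step = 2 * precision
published = np.round(ratios / step) * step

# The rounding error never exceeds `precision`, which is what the
# Uniform(r - precision_r, r + precision_r) likelihood encodes.
gap = np.abs(published - ratios)
print("sample %   :", np.round(ratios, 2))
print("published %:", published)
print("max gap    :", gap.max())
```

The published figures in this notebook always sum to 100, which plain rounding does not guarantee; the sketch ignores that adjustment.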

intentions_r = {k: trace['intentions'][:,0, i] for i,k in enumerate(resultats.keys())}
fig = src.intentions.plot(
    intentions_r,
    colors,
    "20/12/2021",
    "Elabe",
    title="Intentions de vote au premier tour",
    sample_size=num_exprimes,
    base="Comptant aller voter (80% des inscrits)",
    logo_path="~/org/roam/images/logo.png"
)
plt.tight_layout()
plt.savefig(filename, dpi=200, bbox_inches="tight")
filename

figureTSxadC.png

Interesting matchups:

  • Pécresse / Le Pen
  • Le Pen / Zemmour
  • Mélenchon / Zemmour
  • Left-wing primary

Qualification for the second round: Le Pen / Pécresse

fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Pécresse",
    "Le Pen",
    num_points=100
)
plt.savefig(filename, bbox_inches="tight")
filename

WMhEfe.png

import matplotlib.pyplot as plt
import os
from pygifsicle import optimize
import imageio

reference = "Pécresse"
challenger = "Le Pen"

wins = np.ceil(100 * np.sum(intentions_r[reference]>intentions_r[challenger]) / len(intentions_r[reference]))

filenames = []
for i in range(1, 101):  # 1 to 100 points, like the static plots
    if i % 10 == 0:
        print(i)
    plt.clf()
    fig = src.intentions.plot_pair(
        intentions_r,
        colors,
        reference,
        challenger,
        scores={reference: f"{wins:.0f} sur 100", challenger: f"{100-wins:.0f} sur 100"},
        num_points=i
    )

    filename = f"intentions-pairwise-{i}.png"
    plt.savefig(filename, bbox_inches="tight")
    filenames.append(filename)

with imageio.get_writer("intentions-pairwise.gif", mode="I") as writer:
    for filename in filenames:
        image = imageio.imread(filename)
        writer.append_data(image)

optimize("intentions-pairwise.gif", "lepenpecresse.gif")  # Compress with gifsicle into a new file; the source GIF is kept

for filename in set(filenames):
    os.remove(filename)

Le Pen vs Zemmour

fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Zemmour",
    "Le Pen",
    num_points=100
)
plt.savefig(filename, bbox_inches="tight")
filename

gIJxoR.png

Mélenchon vs Zemmour

fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Zemmour",
    "Mélenchon",
    num_points=100
)
plt.savefig(filename, bbox_inches="tight")
filename

QfkrGG.png

import matplotlib.pyplot as plt
import os
from pygifsicle import optimize
import imageio

reference = "Zemmour"
challenger = "Mélenchon"

wins = np.ceil(100 * np.sum(intentions_r[reference]>intentions_r[challenger]) / len(intentions_r[reference]))

filenames = []
for i in range(1, 101):  # 1 to 100 points, like the static plots
    if i % 10 == 0:
        print(i)
    plt.clf()
    fig = src.intentions.plot_pair(
        intentions_r,
        colors,
        reference,
        challenger,
        scores={reference: f"{wins:.0f} sur 100", challenger: f"{100-wins:.0f} sur 100"},
        num_points=i
    )

    filename = f"intentions-pairwise-{i}.png"
    plt.savefig(filename, bbox_inches="tight")
    filenames.append(filename)

with imageio.get_writer("intentions-pairwise.gif", mode="I") as writer:
    for filename in filenames:
        image = imageio.imread(filename)
        writer.append_data(image)

optimize("intentions-pairwise.gif", "zemmourmelenchon.gif")  # Compress with gifsicle into a new file; the source GIF is kept

for filename in set(filenames):
    os.remove(filename)

Left-wing primary

fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Jadot",
    "Mélenchon",
    num_points=100
)
plt.savefig(filename, bbox_inches="tight")
filename

7r1AlW.png

fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Jadot",
    "Hidalgo",
    num_points=100
)
plt.savefig(filename, bbox_inches="tight")
filename

Rv87PI.png

January 2022

Rolling polls have made their entrance. They consist of interviewing a given number of registered voters every day and publishing the results pooled over the last X days. Consecutive publications share respondents, so the samples are not independent and we cannot treat successive points as separate observations unless we model the daily cohorts:

[Diagram: successive daily cohorts (J-4, J-3, J-2, ...) pooled into each rolling publication]

Here we only consider independent samples, that is, one point every X days, so that the cohorts do not overlap.
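Keeping one point every X days can be sketched as below. The function name `independent_waves` and the daily publication dates are hypothetical, and we assume each publication pools the interviews of the `window_days` days ending on its publication date:

```python
from datetime import date, timedelta


def independent_waves(publication_dates, window_days):
    """Keep only publications whose fieldwork windows do not overlap.

    Walking from the most recent date backwards, a publication is kept only
    if it is at least `window_days` older than the last one kept.
    """
    kept = []
    for d in sorted(publication_dates, reverse=True):
        if not kept or (kept[-1] - d).days >= window_days:
            kept.append(d)
    return sorted(kept)


# Hypothetical rolling poll: one publication per day for ten days, 3-day window.
start = date(2022, 1, 10)
daily = [start + timedelta(days=i) for i in range(10)]

selected = independent_waves(daily, window_days=3)
print([d.isoformat() for d in selected])
```

Starting from the most recent publication is a choice: it guarantees that the latest available point is always part of the selection.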

Cluster17

num_exprimes = 2558
precision = .5
resultats = {
    "Arthaud": 0.5,
    "Poutou": 1.5,
    "Roussel": 2,
    "Mélenchon": 12.5,
    "Montebourg": 1,
    "Hidalgo": 2,
    "Jadot": 4.5,
    "Taubira": 5.5,
    "Macron": 22.5,
    "Pécresse": 13,
    "Lassalle": 1,
    "Asselineau": 1.5,
    "Dupont-Aignan": 2.5,
    "Philippot": 1.5,
    "Le Pen": 14.5,
    "Zemmour": 14,
}
print(sum(list(resultats.values())))
assert sum(list(resultats.values())) == 100
results_r = np.array(list(resultats.values())) / 100
precision_r = precision / 100

with pm.Model() as multinomial:
    prior_intentions = np.array(list(resultats.values())) * 0.1
    p = pm.Dirichlet("intentions", prior_intentions, shape=(1,len(prior_intentions)))
    n = pm.Multinomial("respondants", num_exprimes, p, shape=(1, len(prior_intentions)))
    r = n / num_exprimes
    r_obs = pm.Uniform('observed', r-precision_r, r+precision_r, observed=results_r)

    trace = pm.sample()

figureojReTN.png

intentions_r = {k: trace['intentions'][:,0, i] for i,k in enumerate(resultats.keys())}
fig = src.intentions.plot(
    intentions_r,
    colors,
    "15/01/2022",
    "Cluster17 pour Marianne",
    title="Intentions de vote au premier tour",
    sample_size=num_exprimes,
    base="Inscrits",
    logo_path="~/org/roam/images/logo.png"
)
plt.tight_layout()
plt.savefig(filename, dpi=200, bbox_inches="tight")
filename

figureFZzMy7.png

figureqJ3sa8.png

Mélenchon in the second round?

fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Mélenchon",
    "Pécresse",
    num_points=100
)
plt.savefig(filename, bbox_inches="tight")
filename

MTcenv.png

fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Mélenchon",
    "Le Pen",
    num_points=100,
    seed=1
)
plt.savefig(filename, bbox_inches="tight")
filename

hyd8Bw.png

fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Mélenchon",
    "Zemmour",
    num_points=100,
    seed=1
)
plt.savefig(filename, bbox_inches="tight")
filename

NenJUu.png
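Reaching the second round means finishing in the top two, which is a slightly different question from winning a single pairwise comparison. A rough sketch, assuming posterior draws stored as in `intentions_r`. Here they are replaced by synthetic Dirichlet draws built from the published Cluster17 figures with an arbitrary concentration, so the printed numbers are illustrative only, and `top_two_probability` is a hypothetical helper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Published Cluster17 figures for the five leading candidates; the last entry
# of `alpha` lumps everyone else together. The factor 25 is arbitrary.
names = ["Macron", "Le Pen", "Zemmour", "Pécresse", "Mélenchon"]
alpha = np.array([22.5, 14.5, 14.0, 13.0, 12.5, 23.5]) * 25
draws = rng.dirichlet(alpha, size=4000)

# Only the named candidates compete for a top-two spot.
samples = {name: draws[:, i] for i, name in enumerate(names)}


def top_two_probability(samples, candidate):
    """Share of draws in which at most one other candidate is ahead."""
    others = np.stack(
        [v for k, v in samples.items() if k != candidate], axis=1
    )
    ahead = (others > samples[candidate][:, None]).sum(axis=1)
    return float(np.mean(ahead <= 1))


probs = {name: top_two_probability(samples, name) for name in names}
for name, p in probs.items():
    print(f"{name:<10} {p:.2f}")
```

Since exactly two candidates qualify in every draw, the probabilities add up to 2 across candidates, which gives a quick sanity check.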

The left

On the right, everyone is neck and neck (Le Pen 14.5, Zemmour 14, Pécresse 13).

fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Mélenchon",
    "Taubira",
    num_points=100,
    seed=1
)
plt.savefig(filename, bbox_inches="tight")
filename

lKy2Zv.png

fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Taubira",
    "Jadot",
    num_points=100,
    seed=1
)
plt.savefig(filename, bbox_inches="tight")
filename

hErGEM.png

fig = src.intentions.plot_pair(
    intentions_r,
    colors,
    "Hidalgo",
    "Roussel",
    num_points=100,
    seed=0
)
plt.savefig(filename, bbox_inches="tight")
filename

mzWcQ5.png

TODO Plot that shows the probability to be 1st, 2nd, 3rd, etc for each candidate
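A possible starting point for this TODO: compute, for each candidate, the share of posterior draws in which they finish 1st, 2nd, 3rd, and so on. The sketch below assumes draws stored like `intentions_r` (one array per candidate); the helper name `rank_probabilities` and the three placeholder candidates are made up for illustration:

```python
import numpy as np


def rank_probabilities(samples):
    """Return (names, probs) where probs[c, k] is the share of draws in which
    candidate c finishes in position k (0 = first)."""
    names = list(samples)
    draws = np.column_stack([samples[n] for n in names])  # (n_draws, n_cand)
    n_draws, n_cand = draws.shape

    # order[d, j] = index of the candidate in position j for draw d.
    order = np.argsort(-draws, axis=1)

    # Invert the ordering: ranks[d, c] = position of candidate c in draw d.
    ranks = np.empty_like(order)
    rows = np.arange(n_draws)[:, None]
    ranks[rows, order] = np.arange(n_cand)

    probs = np.zeros((n_cand, n_cand))
    for c in range(n_cand):
        probs[c] = np.bincount(ranks[:, c], minlength=n_cand) / n_draws
    return names, probs


# Synthetic draws for three placeholder candidates (illustrative values).
rng = np.random.default_rng(2)
fake = rng.dirichlet([30, 25, 45], size=2000)
samples = {name: fake[:, i] for i, name in enumerate(["A", "B", "C"])}

names, probs = rank_probabilities(samples)
for name, row in zip(names, probs):
    print(name, np.round(row, 2))
```

Each row and each column of `probs` sums to 1. The matrix could then be drawn as a heatmap, for instance with matplotlib's `imshow`, with candidates on one axis and positions on the other.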