Sampling
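sample() is a built-in function of Python's random module. It returns a list of a given length whose items are chosen from a sequence such as a list, tuple, or string (sets were also accepted before Python 3.11, but now have to be converted to a list or tuple first). It is used for random sampling without replacement.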
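To see what "without replacement" means in practice, here is a small sketch that uses only the standard library. The list called population and the sample size of 4 are made up for illustration.

```python
import random

# Illustrative population: the numbers 1 to 10
population = list(range(1, 11))

# Draw 4 distinct items; the population itself is left unchanged
subset = random.sample(population, k=4)
print(subset)

# Without replacement: no position in the population is picked twice
print(len(set(subset)) == len(subset))

# Asking for more items than the population holds raises ValueError
try:
    random.sample(population, k=len(population) + 1)
except ValueError as err:
    print("error:", err)
```

The items printed will differ from run to run because no seed is set.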
Sampling is the process of selecting a subset of units at random from a known population. It lets us obtain information and draw conclusions about the population from the statistics of those units (i.e. the sample), without having to study the entire population.
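As a rough illustration of this idea, the sketch below builds a made-up population of 100,000 numbers and compares its mean with the mean of a random sample of 500 units. The population values, the sample size and the seed are all assumptions chosen only for the example.

```python
import random
import statistics

random.seed(0)  # fixed seed so that the run is repeatable

# Hypothetical population: 100,000 measurements centred around 170
population = [random.gauss(170, 10) for _ in range(100_000)]

# Study only 500 units, chosen at random without replacement
sample = random.sample(population, k=500)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

The two means should come out close to each other, even though the sample contains only 0.5% of the population.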
Sampling is performed for multiple reasons, including:
- Cases where it is impossible to study the entire population due to its size
- Cases where the sampling process involves destructive testing of the samples
- Cases where there are time and cost constraints
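To sample from a Bayesian network and then apply rejection sampling, we follow these steps:
- First, we need a Bayesian network.
- Then we import pomegranate and Counter.
- We define a generate_sample function.
- Inside the function, we loop over all states, assuming they are in topological order (parents come before their children).
- If the state is a non-root node, we sample conditional on its parents: "sample[state.name] = state.distribution.sample(parent_values=parents)".
- Otherwise, we sample from the distribution alone: "sample[state.name] = state.distribution.sample()".
- After each draw, we store the sampled value in the parents mapping so that later nodes can condition on it.
- We return the sample.
- For rejection sampling, we generate many samples and reject every sample that does not match the evidence (here, that the train is delayed).
- Finally, we count the values of the variable we are interested in (appointment) in the samples that remain.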
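The steps above can be sketched without any library on a two-node network. This is only an illustration under stated assumptions: it uses the rain and maintenance probabilities from the model in this post, written as plain dictionaries, and the helper names draw and rejection_sample are ours, not part of pomegranate.

```python
import random
from collections import Counter

# Root node: P(rain)
RAIN = {"none": 0.7, "light": 0.2, "heavy": 0.1}

# Child node: P(maintenance | rain)
MAINTENANCE = {
    "none": {"yes": 0.4, "no": 0.6},
    "light": {"yes": 0.2, "no": 0.8},
    "heavy": {"yes": 0.1, "no": 0.9},
}


def draw(distribution):
    """Pick one value from a {value: probability} mapping."""
    values = list(distribution)
    weights = list(distribution.values())
    return random.choices(values, weights=weights, k=1)[0]


def generate_sample():
    sample = {}
    # Root node: sample from its own distribution
    sample["rain"] = draw(RAIN)
    # Non-root node: sample conditional on the parent's sampled value
    sample["maintenance"] = draw(MAINTENANCE[sample["rain"]])
    return sample


def rejection_sample(n, evidence_rain):
    """Keep only the samples that agree with the evidence, then count."""
    kept = []
    for _ in range(n):
        sample = generate_sample()
        if sample["rain"] == evidence_rain:
            kept.append(sample["maintenance"])
    return Counter(kept)


random.seed(1)  # fixed seed so that the run is repeatable
counts = rejection_sample(10_000, "heavy")
print(counts)
```

Since P(rain = heavy) is 0.1, only about one sample in ten survives the rejection step. This is the main cost of rejection sampling: the rarer the evidence, the more samples are thrown away.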
My name is Happy Khatun. I am a student at City University. This blog is an easy way to learn Python programming in Bangladesh. This course is conducted at City University by our most honorable teacher, Nuruzzaman Faruqui.
In this blog you will find a line-by-line explanation of Python code. Anyone can gain knowledge about Python programming here and overcome their fear of it.
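Here we will discuss a sampling problem. The problem is to estimate the distribution of appointment given that the train is delayed. First we build the Bayesian network (the sampling code further down imports it from a module named model, so we assume it is saved as model.py):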
# Uses the legacy pomegranate API (versions before 1.0)
from pomegranate import *

# Rain node has no parent
rain = Node(DiscreteDistribution({
    "none": 0.7,
    "light": 0.2,
    "heavy": 0.1
}), name="rain")

# Maintenance node is conditional on rain; for that we use
# ConditionalProbabilityTable
maintenance = Node(ConditionalProbabilityTable([
    ["none", "yes", 0.4],
    ["none", "no", 0.6],
    ["light", "yes", 0.2],
    ["light", "no", 0.8],
    ["heavy", "yes", 0.1],
    ["heavy", "no", 0.9]
], [rain.distribution]), name="maintenance")

# Train node is conditional on rain and maintenance
train = Node(ConditionalProbabilityTable([
    ["none", "yes", "on time", 0.8],
    ["none", "yes", "delayed", 0.2],
    ["none", "no", "on time", 0.9],
    ["none", "no", "delayed", 0.1],
    ["light", "yes", "on time", 0.6],
    ["light", "yes", "delayed", 0.4],
    ["light", "no", "on time", 0.7],
    ["light", "no", "delayed", 0.3],
    ["heavy", "yes", "on time", 0.4],
    ["heavy", "yes", "delayed", 0.6],
    ["heavy", "no", "on time", 0.5],
    ["heavy", "no", "delayed", 0.5],
], [rain.distribution, maintenance.distribution]), name="train")

# Appointment node is conditional on train
appointment = Node(ConditionalProbabilityTable([
    ["on time", "attend", 0.9],
    ["on time", "miss", 0.1],
    ["delayed", "attend", 0.6],
    ["delayed", "miss", 0.4]
], [train.distribution]), name="appointment")

# Now create a Bayesian Network and add states
model = BayesianNetwork()
model.add_states(rain, maintenance, train, appointment)

# Add edges connecting nodes
model.add_edge(rain, maintenance)
model.add_edge(rain, train)
model.add_edge(maintenance, train)
model.add_edge(train, appointment)

# Finalize model
model.bake()
import pomegranate
from collections import Counter
from model import model


def generate_sample():
    # Mapping of random variable name to sample generated
    sample = {}

    # Mapping of distribution to sample generated
    parents = {}

    # Loop over all states, assuming topological order
    for state in model.states:
        # If we have a non-root node, sample conditional on parents
        if isinstance(state.distribution, pomegranate.ConditionalProbabilityTable):
            sample[state.name] = state.distribution.sample(parent_values=parents)
        # Otherwise, just sample from the distribution alone
        else:
            sample[state.name] = state.distribution.sample()

        # Keep track of the sampled value in the parents mapping
        parents[state.distribution] = sample[state.name]

    # Return generated sample
    return sample


# Rejection sampling
# Compute distribution of Appointment given that train is delayed
N = 10000
data = []
for i in range(N):
    sample = generate_sample()
    if sample["train"] == "delayed":
        data.append(sample["appointment"])
print(Counter(data))
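As a rough check on the Counter printed above, we can work out the exact probabilities from the same tables. The sketch below is not part of the original program and does not use pomegranate. It copies the probability tables into plain dictionaries (the names RAIN, MAINTENANCE, TRAIN, APPOINTMENT and joint are our own) and sums the joint probability over every combination in which the train is delayed.

```python
from itertools import product

RAIN = {"none": 0.7, "light": 0.2, "heavy": 0.1}
MAINTENANCE = {
    "none": {"yes": 0.4, "no": 0.6},
    "light": {"yes": 0.2, "no": 0.8},
    "heavy": {"yes": 0.1, "no": 0.9},
}
TRAIN = {
    ("none", "yes"): {"on time": 0.8, "delayed": 0.2},
    ("none", "no"): {"on time": 0.9, "delayed": 0.1},
    ("light", "yes"): {"on time": 0.6, "delayed": 0.4},
    ("light", "no"): {"on time": 0.7, "delayed": 0.3},
    ("heavy", "yes"): {"on time": 0.4, "delayed": 0.6},
    ("heavy", "no"): {"on time": 0.5, "delayed": 0.5},
}
APPOINTMENT = {
    "on time": {"attend": 0.9, "miss": 0.1},
    "delayed": {"attend": 0.6, "miss": 0.4},
}


def joint(rain, maintenance, train, appointment):
    """Probability of one complete assignment of all four variables."""
    return (RAIN[rain]
            * MAINTENANCE[rain][maintenance]
            * TRAIN[(rain, maintenance)][train]
            * APPOINTMENT[train][appointment])


# Sum over every assignment in which the train is delayed
p_delayed = 0.0
p_appointment = {"attend": 0.0, "miss": 0.0}
for rain, maintenance, appointment in product(RAIN, ("yes", "no"), p_appointment):
    p = joint(rain, maintenance, "delayed", appointment)
    p_delayed += p
    p_appointment[appointment] += p

# Normalise to get P(appointment | train = delayed)
conditional = {value: p / p_delayed for value, p in p_appointment.items()}

print(f"P(train = delayed) = {p_delayed:.3f}")
for value, p in conditional.items():
    print(f"P(appointment = {value} | train = delayed) = {p:.2f}")
```

This gives P(train = delayed) = 0.213, so roughly 2,130 of the 10,000 samples should be kept, and about 60% of them should be attend and 40% miss. The exact counts from rejection sampling will vary a little from run to run.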

