This morning we had homemade popovers for breakfast - they were great!! So great that I forgot to take a picture until after I finished one and had ripped open the second.
I’m not making any promises, but if you’re at our house in the winter months and you ask nicely, you too could enjoy this treat.
December 30, 2018
December 21, 2018
"...if you want to learn something, I can't stop you. If you don't...I cannot teach you."
As I was catching up on some podcasts after finals week, I listened to the episode of Freakonomics called "Where Does Creativity Come From? (and Why Do Schools Kill It Off?)", which had the following line from legendary trumpeter Wynton Marsalis: "...if you want to learn something, I can't stop you. If you don't want to learn it, I cannot teach you." Whoa! That is so true. I can't count the number of times I have had students in my class who are there because they have to fulfill a science credit (for various reasons) and have very little interest in the physics I am trying to discuss. For years I have tried to foster a classroom environment where learning can happen, but I sometimes forget that students have to WANT to learn what I am offering to teach.
In keeping with my continuing philosophy of not hiding anything about pedagogy, learning, or teaching from my students, I plan to hang some printouts of these images I made in the classroom as a reminder that the choice to engage in learning is solely up to the learner.
After hearing this episode, I thought for sure that some other teacher had discovered it and the Marsalis line before I did. A quick search turned up only one post, this one on Medium from Shaun Mosley. I like how he tied the process of developing creativity and learning to the difference between extrinsic and intrinsic motivation. It is something I have certainly thought a lot about as I have planned my classes and made the shift to Standards-Based Assessment and Reporting.
To all the teachers out there: if you have a chance to listen to the podcast episode, I'd love to know what you think about it and what you are doing in your class to engage learners in creativity. Let me know!
Photo source/credit: Eric Delmar public domain image from Wikimedia Commons.
Images on this page are licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.
August 28, 2018
Some observations from doing a bit of data analysis with DBSCAN and pandas in a Jupyter notebook
Here is a Jupyter notebook I was using today to parse the classifications from the Steelpan Vibrations project. I'm leaving some of the notes here as a reminder to myself for the future. (I learned how to put the Jupyter notebook into the blog from this page.)
I really want to share this because, in all my reading on using DBSCAN for cluster analysis, I had a hard time finding any page online that described how the coordinates of the points identified in a cluster could be paired with the matching data from the larger (original) data set. When I found the solution (see the link in the comments between cells below), it was really obvious, but it was painful not even knowing how to google for what I was looking for.
Function to do the cluster identification with DBSCAN:
In [31]:
def dbscan(crds):
    bad_xy = []  # might need to change this
    X = np.array(crds)
    db = DBSCAN(eps=18, min_samples=3).fit(X)
    core_samples_mask = np.zeros_like(db.labels_, dtype=bool)
    core_samples_mask[db.core_sample_indices_] = True
    labels = db.labels_
    # Number of clusters in labels, ignoring noise (label -1) if present.
    n_clusters_ = len(set(labels)) - (1 if -1 in labels else 0)
    unique_labels = set(labels)
    colors = plt.cm.Spectral(np.linspace(0, 1, len(unique_labels)))
    for k, col in zip(unique_labels, colors):
        if k == -1:
            # Black used for noise.
            col = 'k'
        class_member_mask = (labels == k)
        # These are the definitely "good" xy values (core samples).
        xy = X[class_member_mask & core_samples_mask]
        plt.plot(xy[:, 0], xy[:, 1], 'o', markerfacecolor=col,
                 markeredgecolor='k', markersize=14)
        # These are the "bad" xy values. Note that some maybe-bad
        # and maybe-good are included here.
        xy = X[class_member_mask & ~core_samples_mask]
        plt.plot(xy[:, 0], xy[:, 1], 'o', markerfacecolor=col,
                 markeredgecolor='k', markersize=6)
        bad_xy.append(xy)
    plt.title('Estimated number of clusters: %d' % n_clusters_)
    plt.xlim(0, 512)
    plt.ylim(0, 384)
    # One array of points per cluster, in cluster-label order.
    clusters = [X[labels == i] for i in range(n_clusters_)]
    return clusters, labels
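A side note for future me on what labels actually contains: DBSCAN assigns each input point a cluster number, with -1 reserved for noise. A minimal toy example (not project data) that shows the idea:

import numpy as np
from sklearn.cluster import DBSCAN

# Two tight groups of points plus one far-away outlier.
pts = np.array([[0, 0], [1, 0], [0, 1],
                [100, 100], [101, 100], [100, 101],
                [500, 500]])
db = DBSCAN(eps=5, min_samples=3).fit(pts)
print(db.labels_)  # e.g. [ 0  0  0  1  1  1 -1]: two clusters, one noise point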
Import the classifications into a pandas DataFrame. I'm using header=None because there were no headings in the csv file:
In [32]:
import pandas as pd
df = pd.read_csv('averages-strike1.csv', sep=',', header=None)
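Since the columns get renamed further down anyway, an alternative would be to name them at read time with the names argument (a sketch, assuming the same column order as the rename cell below):

df = pd.read_csv('averages-strike1.csv', header=None,
                 names=['x', 'y', 'filename', 'fringe', 'rx', 'ry', 'angle'])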
This is the main part of the code that ends up calling the dbscan function at the end:
In [34]:
from matplotlib.patches import Ellipse
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib.colors as col
import numpy as np
from sklearn.cluster import DBSCAN

# Map fringe counts (roughly 1 to 11) onto the gist_rainbow colormap.
# Note: the norm and the colormap are separate arguments to ScalarMappable.
cmap_1 = cm.ScalarMappable(col.Normalize(1, 11), cm.gist_rainbow)

x_val = []
y_val = []
frng = []
crds = []
ell = []
for centers in df.values:
    x_val.append(centers[0])
    y_val.append(centers[1])
    frng.append(centers[3])
    crds.append([centers[0], centers[1]])
    ell.append(Ellipse(xy=[centers[0], centers[1]], width=centers[4],
                       height=centers[5], angle=centers[6]))

centers_raw = {'XVal': x_val,
               'YVal': y_val,
               'Fringe': frng}
centers_df = pd.DataFrame(centers_raw, columns=['XVal', 'YVal', 'Fringe'])

plt.figure(0)
plt.scatter(centers_df.XVal, centers_df.YVal, s=20,
            c=cmap_1.to_rgba(centers_df.Fringe), alpha=.6)
plt.xlim(0, 512)
plt.ylim(0, 384)
plt.show()

plt.figure(1)
clusters, labels = dbscan(crds)
Check the DataFrame (run this cell once before and once again after renaming the columns; the output shown here is from a run after the renaming and cluster assignment below):
In [30]:
df[:15]
Out[30]:
|    | x          | y          | filename                | fringe    | rx         | ry         | angle      | cluster |
|----|------------|------------|-------------------------|-----------|------------|------------|------------|---------|
| 0  | 107.716469 | 213.009577 | 06240907_proc_00254.png | 1.000000  | 85.034929  | 67.943204  | -47.505782 | 0       |
| 1  | 114.698967 | 213.766703 | 06240907_proc_00258.png | 1.333333  | 67.924027  | 67.389913  | -51.659952 | 0       |
| 2  | 111.190662 | 218.375451 | 06240907_proc_00270.png | 0.714286  | 67.455082  | 57.088226  | -63.335567 | 0       |
| 3  | 113.800339 | 223.653310 | 06240907_proc_00276.png | 8.333333  | 86.160744  | 73.501320  | -73.822837 | 0       |
| 4  | 88.625250  | 218.599081 | 06240907_proc_00279.png | 7.200000  | 119.292404 | 107.265178 | -76.700412 | 0       |
| 5  | 81.290269  | 220.570363 | 06240907_proc_00281.png | 7.333333  | 115.024131 | 109.400213 | -91.981419 | 0       |
| 6  | 81.476925  | 215.762886 | 06240907_proc_00282.png | 6.166667  | 115.916690 | 111.225947 | -51.426068 | 0       |
| 7  | 72.502562  | 219.822452 | 06240907_proc_00292.png | 7.200000  | 115.302500 | 108.964856 | -54.631973 | 0       |
| 8  | 71.396729  | 213.876289 | 06240907_proc_00295.png | 7.000000  | 132.873660 | 114.236231 | -88.764995 | 0       |
| 9  | 73.012500  | 206.005209 | 06240907_proc_00299.png | 10.000000 | 116.456652 | 113.427691 | -82.312357 | 0       |
| 10 | 62.431250  | 206.850000 | 06240907_proc_00301.png | 10.000000 | 104.117715 | 88.929126  | -2.347311  | 0       |
| 11 | 141.296875 | 252.166667 | 06240907_proc_00301.png | 3.666667  | 55.919208  | 29.365025  | 62.916449  | -1      |
| 12 | 71.331521  | 212.055188 | 06240907_proc_00306.png | 8.166667  | 122.378310 | 99.126123  | -52.857932 | 0       |
| 13 | 71.714899  | 208.812385 | 06240907_proc_00307.png | 8.666667  | 107.007787 | 98.573020  | 11.509674  | 0       |
| 14 | 286.998737 | 170.834790 | 06240907_proc_00307.png | 1.200000  | 34.312887  | 32.881617  | -0.016536  | 1       |
In [7]:
labels
Out[7]:
array([0, 0, 0, ..., 0, 1, 3])
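A quick way to see how many points landed in each cluster (and how many were flagged as noise, label -1) is value_counts on the labels:

pd.Series(labels).value_counts()  # -1 is the noise count; the rest are cluster sizes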
These next two lines are the magic that connect the clusters identified by DBSCAN with the original classifications so that we can plot the fringe measurements for each cluster over time.
Finally figured this out by reading the question posted here: https://datascience.stackexchange.com/questions/29587/python-clustering-and-labels
In [8]:
cluster = pd.Series(labels)
df["cluster"] = cluster
Rename the DataFrame columns:
In [10]:
df = df.rename(index=str, columns={0: "x", 1: "y", 2: "filename", 3: "fringe", 4: "rx", 5: "ry", 6: "angle"})
Assign each cluster its own variable:
In [27]:
cluster0 = df[df['cluster']==0]
cluster1 = df[df['cluster']==1]
cluster2 = df[df['cluster']==2]
cluster3 = df[df['cluster']==3]
cluster4 = df[df['cluster']==4]
cluster5 = df[df['cluster']==5]
cluster6 = df[df['cluster']==6]
cluster7 = df[df['cluster']==7]
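Those eight nearly identical assignments could also be collapsed into one dict comprehension (a sketch; clusters_by_id is just a name I'm using here):

clusters_by_id = {k: df[df['cluster'] == k] for k in range(8)}
# e.g. clusters_by_id[0] is the same as cluster0 above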
Make plots!!!
In [29]:
plt.scatter(cluster0.index, cluster0.fringe)
plt.show()
In [36]:
plt.scatter(cluster1.index, cluster1.fringe)
plt.show()
In [37]:
plt.scatter(cluster2.index, cluster2.fringe)
plt.show()
In [38]:
plt.scatter(cluster3.index, cluster3.fringe)
plt.show()
In [39]:
plt.scatter(cluster4.index, cluster4.fringe)
plt.show()
In [43]:
plt.scatter(cluster5.index, cluster5.fringe)
plt.show()
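For future reference, the six plotting cells above could also be generated in one loop, since they all follow the same pattern (a sketch, assuming the renamed df with its cluster column from earlier):

for k in range(6):
    c = df[df['cluster'] == k]
    plt.figure()
    plt.scatter(c.index, c.fringe)  # fringe measurement for each classification in cluster k
    plt.title('Cluster %d' % k)
    plt.show()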