Regular expressions for renaming files

I have a list of files I want to rename, but want to retain a number, date, subject, and filetype.

The filenames are formatted as:


Deuce_Plays_-_(1)_-_(mm-dd-yyyy)_-_(3).mp3

Key steps:

  1. recognizing non-alphanumeric characters in regular expressions
  2. keeping episode number (1) and title (3)
  3. reformatting date to mm-dd-yy
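
The key steps above can be sketched in Python. This is only a minimal sketch under several assumptions that are not in the original notes: the parentheses in the format are placeholders rather than literal characters, the episode number is all digits, and the output layout `<episode>_<mm-dd-yy>_<title>.<ext>` is made up for illustration. The name `new_name` is hypothetical.

```python
import re

# Assumed input layout: Deuce_Plays_-_<episode>_-_<mm-dd-yyyy>_-_<title>.<ext>
# Underscores and hyphens are literal here; only "." needs escaping.
PATTERN = re.compile(
    r"^Deuce_Plays_-_(?P<episode>\d+)_-_"
    r"(?P<month>\d{2})-(?P<day>\d{2})-(?P<year>\d{4})_-_"
    r"(?P<title>.+)\.(?P<ext>\w+)$"
)


def new_name(old_name):
    """Return the reformatted filename, or None if the name does not match."""
    m = PATTERN.match(old_name)
    if m is None:
        return None
    # Keep only the last two digits of the year: mm-dd-yyyy -> mm-dd-yy.
    short_date = f"{m['month']}-{m['day']}-{m['year'][2:]}"
    # Hypothetical output layout; adjust to whatever naming scheme is wanted.
    return f"{m['episode']}_{short_date}_{m['title']}.{m['ext']}"
```

To actually rename, one could loop over a directory listing and call `os.rename(old, new_name(old))` for each name where `new_name` does not return `None`.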


Posted in Uncategorized | Leave a comment

Multivariate regression for pre-post observational data

As part of a consulting project, an investigator came for assistance with a preliminary analysis of data from an observational study on change in health-related quality of life (HRQL) from pre- to post-hospitalization for an acute condition. Analysis of the data presented several issues, likely encountered by others analyzing similar data:

  1. multiple adjustment variables deemed clinically important by investigator, some with 7+ categories and low/no cell counts: does it make sense to adjust for these variables in the preliminary analysis?
  2. accounting for variable post-hospitalization response time: can/should we adjust for time to response?
  3. confounding among adjustment variables: a relationship between measures of illness severity and clinical measures is plausible.
  4. adjusting for baseline response

Current thoughts for the first issue:

Multiple adjustment variables with low/no cell counts

  • Recommend that the client obtain more subjects in categories with low/no cell counts, as contrasts for these groups are impossible to estimate (empty cells) or unstable (sparse cells)
  • Refer to the statistical “rule of thumb” of at least 10 observations per fitted variable, and note that a categorical variable with k categories requires fitting k-1 dummy variables.
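
The rule of thumb above is simple arithmetic, sketched here in Python. The function name, the argument names, and the example category counts are all hypothetical; the only inputs taken from the notes are the 10-observations-per-variable rule and the k-1 dummy variables per categorical variable.

```python
def required_sample_size(category_counts, n_continuous=0, obs_per_parameter=10):
    """Count fitted variables and the sample size the rule of thumb suggests.

    category_counts: number of categories for each categorical adjustment variable
    n_continuous: number of continuous adjustment variables (one parameter each)
    """
    # A categorical variable with k categories needs k - 1 dummy variables.
    n_dummies = sum(k - 1 for k in category_counts)
    n_parameters = n_dummies + n_continuous
    return n_parameters, obs_per_parameter * n_parameters


# Hypothetical example: three categorical variables (7, 3, and 2 categories)
# plus two continuous covariates.
n_parameters, n_needed = required_sample_size([7, 3, 2], n_continuous=2)
```

A single 7-category variable already uses 6 fitted variables, which is why variables with many categories are costly in a preliminary analysis.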




Causal Modeling Homework 4 (final)

Homework 4 covers the following topics:

  1. Markov equivalence classes (of directed acyclic graphs)
    1. define ‘equivalent graphs’ in terms of d-separations implied by each class of graphs
  2. From a set of d-separations,
    1. determine adjacent pairs of vertices, unshielded colliders, corresponding Markov equivalence class, CPDAG (completed partially directed acyclic graph)
  3. Given a joint distribution P(a,b,c,d,e) that factorizes according to an unknown DAG G* with a set of conditional independence relationships,
    1. Inferring structure of DAG G*
    2. Assumptions on DAG G* (about inferences)
  4. PC algorithm
    1. Interpreting steps, understanding how the PC algorithm uses significance of associations to remove/direct edges
    2. orienting undirected edges in resulting CPDAG consistent with time-ordering of variables (common-sense causal relations)
  5. Single-world intervention graphs (SWIGs)
    1. intervene on x, read implications
    2. Applying the backdoor formula to adjust for confounding
  6. Intent-to-treat (ITT) effect of treatment assignment (Z) on outcome (Y) given treatment (X)
    1. Checking the IV inequalities (to test the assumption that treatment assignment Z has no direct effect on outcome Y, so that Z influences Y only through treatment X)
    2. Constructing 3-d polytope for (%Helped, %Hurt, %Always recover, %never recover) compatible with data in Z=0 (control assignment), Z=1 (treatment assignment), and combination of both
      1. (recall: intersecting the 3-d polytopes and THEN projecting to 2-d yields a subset of what you get by projecting each polytope to 2-d and THEN intersecting)
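
As an illustration of the IV inequalities in topic 6, here is a minimal Python sketch assuming binary X and Y and using the standard instrumental inequality, max over x of the sum over y of max over z of P(X=x, Y=y | Z=z) ≤ 1. This may not be the exact form used in the homework, and the function name and both example distributions are made up. A violation is incompatible with the IV model; satisfying the inequality does not prove the model holds.

```python
def satisfies_iv_inequalities(p, tol=1e-12):
    """Check the instrumental inequality for binary X and Y.

    p[z][(x, y)] = P(X=x, Y=y | Z=z); each p[z] should sum to 1.
    """
    for x in (0, 1):
        # For each y, take the largest P(x, y | z) over arms z, then sum over y.
        total = sum(max(p[z][(x, y)] for z in p) for y in (0, 1))
        if total > 1 + tol:
            return False
    return True


# Made-up example: the same distribution in both arms (no violation).
compatible = {
    0: {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25},
    1: {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25},
}

# Made-up example: among X=0, the outcome flips with Z, which the IV model
# cannot produce without a direct effect of Z on Y.
violating = {
    0: {(0, 0): 0.90, (0, 1): 0.00, (1, 0): 0.05, (1, 1): 0.05},
    1: {(0, 0): 0.00, (0, 1): 0.90, (1, 0): 0.05, (1, 1): 0.05},
}
```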

Vagal tone and prosocial behavior

Two articles provide a connection between physiological control over heart rate and the sympathetic/parasympathetic nervous system (vagal tone), empathic traits (empathic accuracy and empathic behaviors), and social power.

The first source is entitled “Social Power Facilitates the Effect of Prosocial Orientation on Empathic Accuracy” (ref: Link to PDF). Briefly, they found that prosocial orientation (measured by respiratory sinus arrhythmia, or one’s ability to change heart rate with breathing) improves empathic accuracy (the ability to accurately judge the emotions of a stranger) when the individual is (or perceives themselves to be) of high power. They found that individuals they induced to feel powerful were more capable of gauging a stranger’s emotional state.

The second source is entitled “Roots and Benefits of Costly Giving: Children Who Are More Altruistic Have Greater Autonomic Flexibility and Less Family Wealth” (ref: Link to PDF). The research indicates that children who forgo self-gain to help other people tend to have greater vagal flexibility and higher subsequent vagal tone compared to children who do not, and this effect is measured early on in child development. They define altruism as “costly helping” and posit that altruism is one of the harder prosocial behaviors to learn. There is a field of study (called polyvagal theory) dedicated to studying how our biology affects our behaviors and perceptions. Polyvagal theory “posits that prosociality is supported by physiological states that foster calm social engagement and inhibit defensive responding (i.e., fight-or-flight behaviors).”

More to come.



DSMB reports

Met with Eric today and went over some tables that are generated each study period. Some time-specific sequences:

  • Data collection will stop about 4-6 weeks before the DSMB meeting (to give the CTC time to collect and clean data and to get reports out to the DSMB)
  • Meeting is every 3 months (but they might not enroll enough patients given the 5/2015 cutoff)

Some new concepts:

  • CONSORT diagram – flowchart describing how they arrived at the analysis population
  • TEG – thromboelastography: measurement of efficiency of blood coagulation
  • DRS (outcome) – Disability Rating Scale: alternative (secondary) measurement of outcome
  • GOSE (primary outcome) – Glasgow Outcome Scale, Extended: measure of brain functioning (1=dead, 8=OK)
  • Key SAEs (serious adverse events):
    • DVT – deep vein thrombosis
    • MI – myocardial infarction
    • PE – pulmonary embolism
    • seizures

Statistical issues:

  • One measure of progression comes from head CT scans. Of particular interest is the progression of blood (clots/presence/size) on the CT scan(s), but measuring it requires the patient to survive to the second (or later) CT. The question is whether confounding (in survival, in the treatment effect, or in something else) affects whether the second CT is observed; if so, the interpretation of “CT scan blood clot progression” or the like depends on how the sample was obtained.

Traumatic brain injury treatment (concepts and vocabulary)

An issue came up during time checks on three events (chemical paralysis, advanced airway placement, and qualifying GCS test) and when they happened relative to when the study kit was opened. This led to a discussion of the placement of advanced airways, and what an “advanced” airway really means.

When a patient is not able to breathe on their own, a medical intervention is used to pump air and/or oxygen to the lungs. This often requires the patient to be chemically paralyzed (to prevent the sputtering/choking response from having a tube shoved down your throat). For ‘definitive’ airways, this often involves having a tube with an inflatable cuff that goes past your vocal cords down into the space right above the lungs (care must be taken to prevent the tube from entering the stomach, which would be a ‘sentinel event’ — one that is likely to cause serious injury or death). A cuff is then inflated so that air can only flow in and out through the tube. This type of airway requires some form of imaging to ensure that the breathing tube gets to the correct spot (and doesn’t pierce the vocal cords or go into the stomach, for instance).

Another form of airway (which they dubbed ‘alternative’) involves the use of a breathing tube that does not go past the vocal cords (I think?). This type of airway cannot use a CO2 sensor (one that turns from purple to yellow) to indicate its correct placement.

The final airway is what she called a ‘Bag/Valve mask’, which is a mask placed over the patient where air/oxygen is pumped in. This may work for patients who are having trouble getting enough oxygen to their lungs but are still able to work the muscles to breathe — someone who is paralyzed (and/or has serious brain injuries) may not be able to breathe without intervention.

The real issue is that when a person is chemically paralyzed, an accurate assessment of their brain functioning (using the Glasgow Coma Scale, or GCS) is difficult to obtain. There are three components to the scale (E/V/M) – eyes, verbal, and motor, with max scores of 4/5/6. When a person is chemically paralyzed, their ability to respond to verbal commands (to move their extremities and eyes) is reduced, but it is unclear whether this is due to damage to the brain or simply to the drugs. A GCS score of <=12 is required for enrollment into the clinical study, so an accurate assessment of the GCS score is needed in order to determine eligibility status.
As an aside, if a person is intubated, their verbal score is 1 (since they can’t talk with a tube down their throat!).
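
The scoring arithmetic above can be sketched in Python. This is a minimal sketch and not the study’s actual eligibility logic: the function names are hypothetical, the minimum of 1 per component is the standard GCS convention rather than something stated in these notes, and the real enrollment criteria involve more than the GCS cutoff.

```python
def gcs_total(eye, verbal, motor):
    """Sum the three GCS components after checking each is in range."""
    # Maximum scores from the notes: eyes 4, verbal 5, motor 6.
    # The minimum of 1 per component is the standard GCS convention.
    limits = {"eye": (eye, 4), "verbal": (verbal, 5), "motor": (motor, 6)}
    for name, (score, maximum) in limits.items():
        if not 1 <= score <= maximum:
            raise ValueError(f"{name} score must be between 1 and {maximum}")
    return eye + verbal + motor


def meets_gcs_criterion(eye, verbal, motor, cutoff=12):
    """True if the total GCS is at or below the enrollment cutoff (<=12)."""
    return gcs_total(eye, verbal, motor) <= cutoff
```

For example, an intubated patient with otherwise full scores has E/V/M = 4/1/6, for a total of 11, which falls under the cutoff even though the low verbal score reflects the tube rather than the injury.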


Book wish list / review list

I’ve noticed I have a hard time “guiding and directing” my learning, and oftentimes I acquire information and develop intuition through non-intuitive and/or seemingly non-informative activities (how nonintuitive!). I don’t know if this is hindsight bias, but it often was easier for me in the past to learn something or figure out how to do something by not even thinking about it, but just “doing it”, or “reading about it”, or “learning it”, rather than talking it up and either psyching myself out or not putting in the requisite effort necessary to understand something or get good at it.

I have a huge reading list, both for (A) work: vaccine statistics, probability, machine learning, statistics, mathematics; (B) personal: economic theory, behavioral psychology, investing, probability and statistics, game theory and strategy; and (C) poker: web-based instructional videos, poker forums, poker books, and poker calculation tools (works on both computational aspects of SQL-like queries and higher-order problems in poker relating to combinatorics/probability).

I’d like to improve my efficiency by reducing certain time-wasting activities that take up most of my free time: (1) YouTube, (2) Facebook, (3) random Craigslist surfing with no directed purpose of purchasing anything, (4) Reddit (not so much anymore), (5) smoking and sitting around, (6) reading message boards/forums without taking notes, (7) watching poker videos/fail videos/fights on YouTube.

I figure it’d be better to choose some interesting subjects to read about, and to keep a book-review-esque “journal” of the books that I’ve read: to pick out interesting quotes, mull over ideas that haven’t fully gelled with me, and perhaps one day construct my own ideas and philosophies surrounding risk, behavior, and medical science!

Here’s a list of books/tools/papers that I’m either currently reading, or in the process of wanting to buy/bought/not read/halfway read/already read but not reviewed/wrote something on it. Will follow up with reviews as I go through. NOTE: I may need to end up retrospectively organizing the layout, but for now we’re just going to try something new.


Work materials:

  • LATB, Crush Live Poker subscription
  • PQL, Odds Oracle


  • Nassim Nicholas Taleb – Antifragile
  • Nassim Nicholas Taleb – Fooled by Randomness
  • Charles Duhigg – The Power of Habit
  • Nate Silver – The Signal and the Noise
  • Richard Thaler – The Winner’s Curse
  • Kelly McGonigal – The Willpower Instinct
  • Edward Chancellor – Devil Take the Hindmost
  • William Poundstone – Fortune’s Formula
  • Daniel Kahneman – Thinking, Fast and Slow
  • Gary Belsky – Why Smart People Make Big Money Mistakes
  • Thomas C. Schelling – Micromotives and Macrobehavior
  • Leonard Mlodinow – Subliminal: How Your Unconscious Mind Rules Your Behavior
  • Bruce Hood – The Self Illusion
  • Robert B. Cialdini – Influence: The Psychology of Persuasion

Non-book reading:

  • Alan Agresti – A Survey of Exact Inference for Contingency Tables
  • Roderick J. A. Little – Testing the Equality of Two Independent Binomial Proportions
  • Qin et al. – A Framework for Assessing Immunological Correlates of Protection in Vaccine Trials
  • Plotkin, Gilbert – Nomenclature for Immune Correlates of Protection After Vaccination
  • Plotkin – Correlates of Vaccine-Induced Immunity
  • Gilbert et al. – Evaluating Immune Correlates of Protection
  • Tamara G. Kolda – Tensor Decompositions and Applications
  • Gilbert Strang – Linear Algebra (MIT OpenCourseWare)

