An interesting visual depiction of spurious correlation (check it out here) reminded me of my grad school days and the rigor with which I would build hypotheses. Rather than let R, SPSS, or Excel correlate away and then proclaim some amazing finding, I started from the reasons and results I expected and then tested them against data. The difference matters: all too often, the correlate-first approach tells you very little because of endogeneity, spurious results, and a lack of context.
Some organizations (Google is known for this) will say “don’t worry about the why”. Some have referred to this approach as “theory-free”, a nice euphemism for how little long-term value we might find in these correlations. For consumer behavior, where Big Data is truly present, perhaps this works. But nonprofit analytics rarely has data points available in the way that, say, Target and Wal-Mart do…although there are new options underway, like David Lawson’s newsci.co.
And if you talk with a gift officer who’s been disappointed with predictive modeling results, you see a different picture. From that vantage point, the analytics results are frequently devoid of context. The results confirm what we already knew (“these prospects look rich! they live in a nice neighborhood!”) or reflect a pattern we already see (“they gave last year! let’s ask them again!”). Yet modeling doesn’t typically improve relationships with prospects.
A big culprit: Context. Donor context is critical in building relationships. And, context is quite challenging to incorporate into modeling. The following are real examples of discussions about potential prospects surfaced by a context-free model:
- “Sure, Jane looks promising, but we don’t have a phone number to reach her and no volunteer connection, so how likely is it she’s approachable?”
- “Absolutely, Ed looks great, but did you know he just filed for divorce?”
The solution to this issue isn’t to cast off analytics. It’s to improve it. Start with theory and build it into the model. Guard against spurious results. Don’t elevate an endogenous variable as meaningful. And, most of all, our industry needs resources that can actually add context to results. As a student of philanthropy, I am anxiously awaiting the time when our new science of analytics better delivers on the hype and improves our understanding of donor behaviors, while avoiding endogeneity and spurious results.
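For anyone who wants to see how easily "correlate away" produces a finding, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function names are hypothetical, the "giving" series and the candidate variables are pure random noise, and the sizes (20 donors, 1,000 candidate variables) are arbitrary. It screens many meaningless variables against a small sample and keeps the best-looking correlation, then shows what the same kind of noise looks like on a much larger fresh sample.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def noise(rng, n):
    """n independent standard-normal draws: a variable with no meaning."""
    return [rng.gauss(0, 1) for _ in range(n)]


def best_noise_correlation(n_obs=20, n_candidates=1000, seed=0):
    """Screen many noise variables against a noise 'giving' series
    and return the strongest correlation found (theory-free search)."""
    rng = random.Random(seed)
    giving = noise(rng, n_obs)  # hypothetical outcome, also just noise
    best = 0.0
    for _ in range(n_candidates):
        r = pearson(giving, noise(rng, n_obs))
        if abs(r) > abs(best):
            best = r
    return best


def fresh_sample_correlation(n_obs=2000, seed=1):
    """Correlation of two unrelated noise series on a large fresh sample."""
    rng = random.Random(seed)
    return pearson(noise(rng, n_obs), noise(rng, n_obs))


if __name__ == "__main__":
    print("best in-sample |r|:", round(abs(best_noise_correlation()), 2))
    print("fresh-sample |r|:  ", round(abs(fresh_sample_correlation()), 2))
```

With a small sample and enough candidates, the winning correlation typically looks impressive even though nothing real is there, while the large fresh sample hovers near zero. That is the argument for starting with a hypothesis and checking results against data the search never saw.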