Stanford InfoBlog: An Often Ignored Collaboration Pitfall: Time Phase Agenda Mismatch (Posted by Andreas Paepcke)

[An earlier version of the following thoughts was posted on an internal online forum of the Council on Library and Information Resources (CLIR). The material was further discussed and developed at CLIR's symposium on Promoting Digital Scholarship: Formulating Research Challenges in the Humanities, Social Sciences, and Computation.]

The Stanford Infolab has enjoyed a multi-year string of active, cross-disciplinary collaborations. We have worked closely with biodiversity researchers, physicians, and political scientists on projects of mutual interest. Several publications emerged from these collaborations, not just in the CS community, but also in the biology literature [e.g. 1, 2, 3, 4, 5].

Stanford University, the National Science Foundation, and others attempt to encourage cross-disciplinary efforts through financial and other incentives. In our experience such collaborations are in fact highly beneficial. They are not, however, trivial to manage.

Time Phase Agenda Mismatch

Every cross-disciplinary project we have been involved in has experienced some degree of mismatch in what would be the optimal activity for each party at any given time. For example, the new computing tool that would provide the greatest immediate progress to, say, a political scientist might be of no interest to a computer scientist who needs to publish, because the underlying science for the tool was developed several years ago.

Conversely, a cutting-edge CS prototype might be too exotic for use by a political scientist trained in more standard tools, or it might prove too brittle and incomplete for everyday use. In entering collaborative work, both partners therefore need to be clear about expectations.

Note that all parties in an endeavor might well agree that long-term collaboration is the right approach. The problem lies in the day-to-day decisions about resource and time allocation. A look at the traditional process of computer science research will clarify the issue from the CS point of view.

Computer Science Workflow

Here is the required workflow for many computer science faculty at research universities: Propose to the National Science Foundation an important, difficult-to-solve problem, plus thoughts towards a solution. Grant in hand, compete with other faculty of the same university for student interest. Ph.D. students are the most valuable in this competition, because they will stay longer than master's or undergraduate students and will dig deeper.

The faculty member's responsibility towards Ph.D. students is to move them towards graduation. This task requires the identification of constituent sub-problems, whose solutions will be published in highly regarded computer science conferences or journals. Often the work will include a prototype that is stable enough for performance measurements or usability testing. Very rarely will this prototype include all the details that would be required for practical use.

In fact, forcing Ph.D. students into such 'filler' work of adding the required bells and whistles to a prototype might be considered irresponsible, because these students are already trained for this type of work and need new challenges.

Employing master's students can be, and often is, the answer. Two issues arise with this solution. First, the best master's students will be looking to tackle cutting-edge CS work. Offered filler work, they will choose other projects, leaving only less talented or insufficiently trained students, who then need very significant supervision.

The second downside of hiring master's students for filler work is that the investment (currently about $75,000 per academic year at many institutions) will not move a computer science professor closer to the next grant. That grant will be required to support the existing Ph.D. students, who usually straddle the time boundaries of at least two grants.

Where's the Payoff?

Enter the biologist, physician, historian, political scientist, or law scholar in the cross-disciplinary enterprise. Let me denote this person the 'partner'. We assume here that the common vision of a collaborative project is compelling to both participants. Both are perfectly well disposed towards the other. Let us even assume that the respective fields' jargon as well as deeper conceptual notions are mutually understood. Assume further that the CS professor will hear and understand the needs of the partner.

A novel prototype is now constructed with important input from the partner. Everyone is rightfully excited. But now the problem sets in. The CS professor and the involved students will write a CS paper, and they are then ready to move on to the next sub-problem of the collaborative project. The partner, in contrast, is eager to start using the tool, ... which breaks under even mild use and does not include all the required features.

Do note that this state of a prototype is acceptable in the context of CS publications. A perfectly honest prototype is expected to be built up to the point where the *salient* features are solid and can be measured. It is understood in the CS research community that the remainder of a prototype may be a scaffold. That state of affairs is not a scam. Taking software from prototype to product quality is extremely expensive and, again, will not lead to progress in the students' or professor's research career.

Where are we now in this scenario? The CS professor is impatient to move on to the next sub-problem within the project. The partner is disappointed. He has invested significant time explaining his problems to the CS team, and testing intermediate results. Now, when his labor's results seem near, they are not.

The know-how for the often very large remainder, the filler work, was developed in CS years ago, when the partner did not need it. Now he does, but the CS resources are not allocated. The CS professor's and the partner's agendas are out of phase, even though their long term goals match.

The take-away point is that a collaboration agreement must address this situation before work begins. Expectations must be managed and mutually understood.

Some Solutions

Our own past successes broke out of this difficulty along different paths. Admittedly, we did not plan any of these solutions in advance. In one case the possibility of a startup company was enough to make the partner's work worthwhile to him. Sometimes, if CS results from the prototype promise economic interest, an existing company might license the ideas and work the prototype into a product, from which the partner can then benefit. Delays in the partner's satisfaction are naturally built into this solution.

In another case the succession of published results led to follow-on funding that included resources for the partner. The CS-typical rapid forward movement without full development of the covered terrain thereby benefited everyone: the readers of the resulting publications learned; the CS professor and partner enjoyed the satisfaction of having produced knowledge that neither could have produced alone; the funding agencies produced innovation in accordance with their mission; the CS professor's future research will be colored by the new understanding of the partner community's needs; and the partner can enjoy financial resources in addition to having gained an improved understanding of what is easy in CS and what is hard. Future collaborations will thereby be improved as well. The disadvantage of this solution is that the partner's research community cannot see the impact on their area of expertise until much later.

Yet another model we have followed is for research staff to skip vacations and to spend the summer implementing filler portions of a prototype. This activity means that correspondingly little grand thinking is achieved during that time. But the prototype moves to full usability by the partners. Unfortunately, this solution is difficult to scale.

No matter the field of a partner, the computer science side will often need to engage in at least some 'grunt work' at some point in the project. This work needs to be of immediate, convincing benefit to the partner. CS research culture will need to learn how to accommodate these activities even though they are currently often not respected.

The Role of Funding Agencies

Some calls for funding proposals require proof that the output tangibles of the research---prototypes, data sets, and such---will be maintained and expanded after expiration of the grant. While likely motivated by the right concerns, such a requirement is usually impractical. For what can proposing research organizations promise?

A startup company is one option for a continuation promise. Unfortunately, economic feasibility can usually not be predicted in the context of advanced research projects. The promise of a startup company is therefore unrealistic at the time proposals are written.

Another promise might be the hire of full-time staff that will care for the tangibles after the grant terminates. Two problems arise with this solution. First, such staff needs to be financed over long periods of time---a commitment most funding agencies are unable and unwilling to make.

Second, a CS research organization cannot, grant after grant, staff the maintenance of an ever-growing number of orphaned tangibles. Such an organization would quickly run out of funds and of supervisory capacity for its students, to whom educational institutions, at least, owe their focus.

Unfunded mandates in calls for grant proposals are thus not a likely answer. One possibility might be for grants to include money specifically for hardening prototypes. For example, such funds might be spent to hire the student(s) who constructed the prototype for the summer following their graduation. The advantage of this solution is that the creators of the prototype are in the best position to improve code quickly. However, salaries would likely need to be higher than what is typical for students, because first, these potential hires will be graduates at that point, and second, the work of hardening is not desirable for many (at that point former) students.

Another component to addressing the problem of time phase agenda mismatch would be for funding agencies truly to acknowledge the efforts of non-CS partners in collaborative grants. Concretely, such acknowledgment would mean that subsequent proposals by, say, a historian could realistically cite the results of an earlier collaborative effort as past achievement in the field of history. Even if the collaborative effort did not immediately lead to changes in historical inquiry, the advancement of computing methods towards use by historians must 'count' as a true contribution.

Conclusion

In summary, cross-disciplinary computing projects harbor immense potential for both parties. Both can be inspired just by grasping the other's mode of thought. The potential exists for moving both fields forward. Frequently, however, results of cross-disciplinary work cannot advance both disciplines equally during any given phase of a collaboration. When one party is satisfied, additional work, time, and money are often required to provide satisfactory closure for the other as well. Satisfaction will usually not be symmetric at any given time during a collaboration. Both sides must anticipate overhead work that would not be considered worthy of attention in a single-disciplinary activity. Funding agencies can play a role by (i) encouraging the hardening of tools, and (ii) creating a culture where collaboration is rewarded with favorable consideration for future funding even if the significance of outcomes is asymmetric among the participating parties.

The answer to the above complications in cross-disciplinary work is to adjust reward structures and foster the cultural adjustments that will be required across the disciplines. The potential benefits are well worth this effort.

Labels: andreas, bioact, clir, collaboration, cross-disciplinary, infolab, paepcke

This entry was posted by Andreas Paepcke, on Saturday, November 8, 2008.

Anonymous | November 9, 2008 at 10:00 AM |
I think you've overlooked an important potential third party in collaborations like these: the faculty and students of information schools and similar interdisciplinary programs. I-school researchers are often specifically interested in the innovation process by which existing inventions are adopted by specific communities or by which specific users are enrolled in a technological program surrounding that invention. Processes like "hardening," while spurned by most computer scientists as "filler" work, are actually complicated negotiations critical to the ultimate success or failure of innovation. I-school researchers can and do win grants to study these processes, and the studies can include iterative development of prototypes into production-quality systems. I-school master's students are often well-qualified to oversee and execute this kind of iterative development. While I agree with all your points, I would add that funding agencies should favor funding collaborative work that includes not only CS inventors and domain-specific early adopters, but also the I-schools that have specifically positioned themselves to be able to play the bridging role often missing in such collaborations.
Anonymous | November 24, 2008 at 9:00 PM |
Nice, thoughtful piece, Andreas! As I was reading, I was wondering: "what are the solutions?" as I scrolled down and found your solutions discussion. And then, "what is the role of the funding agencies?" And that came along, too...

While I somewhat agree with Ryan's point above, I don't think i-schools can follow through with all such projects (perhaps not even those conceived in an i-school), important as they may be. There's just not enough development power.

Maybe the department has to play a role? For instance, a CS department in, say, Stanford University (or some better school) could have a staff of 4-5 "research engineers", as they would call them in industry. The engineers will be hired by the department but budgeted from the various faculty grants and deployed as needed; those grants will include an implementation phase that would be handled by this in-house team. The benefits, of course, are mainly in having a reliable team of engineers that have a somewhat-permanent position instead of being hired per-project. Projects can be transferred to this team for development and maintenance, and will be expired/commercialized when appropriate. I think the NSF might be happy to add $100K for 6 months of developer time for project proposals, as these extra funds might add to the "broader impact" goals.
Anonymous | November 24, 2008 at 10:57 PM |
I could imagine Ryan's idea working. Two issues to consider, though. It would be important that the receiving iSchool has its own goals around the respective project, which would (i) further the iSchool's special educational and research agenda, and would (ii) still harmonize with the partner's goals. Otherwise it would be difficult to maintain the required level of resource investment. Ryan's example of deployment study does fit that shoe. Of course, the iSchool would need early enough involvement that the principal developers are still around for the transfer.

The second issue concerns the dealings with the collaboration partner. Good collaboration requires some trust. Not trust in the sense of honesty, but in the sense of belief that the other side is truly interested in furthering both of the involved agendas, and that credit is properly shared.

Developing such a working relationship takes some time and effort. Passing a partner on to a whole new (and physically remote) set of players, namely the iSchool team, would require special care again.
Anonymous | December 8, 2008 at 10:15 AM |
The mismatch being addressed in this blog entry is one of the reasons I left academia for the "real world" (first Bell Labs, then a real software company, and now a quasi-research company). Being funded (partly) by an SBIR rather than a research grant lets me add the bells and whistles to software without worrying about whether it's publishable as new.

There's some benefit to a researcher's career from releasing software that works. I did this back when I was an academic at Carnegie Mellon. You get lots of citations, you get invited to things, and lots of people want to bounce ideas off of you. These are all tangible benefits to academics.

Statisticians often release their source at places like CRAN (the R archive). For instance, I'm digging through Douglas Bates's linear mixed effects package in R today and tomorrow, and it's nicely documented both in code and technical materials. It has lots of users.

The benefits from writing software are comparable to what you get from writing textbooks. Sure, it's not research per se, but it provides citations, and if it's a good book, an air of authority. And you also get a tool you can use for teaching. Not to mention being forced to organize your ideas more coherently.

I don't find too many Ph.D.-level computer science students who actually know how to code. Sure, they can cobble things together to get things to work, and build some amazing algorithms. But this is different than having the discipline to write professional software, which isn't any harder to write, just different. Two years as a professional C coder in speech recognition between Bell Labs and my current gig showed me just how much I had to learn. I'm guessing with all the open source out there that folks are at least getting better at things like source control and reading other people's code.
