Posit PBC
topepo/Posit-AI-Demo

I'm a Ph.D. Biostatistician:
A paid, subscription-based product.
There are currently two main components:
We want to create AI tooling to help you get your work done effectively, efficiently, and safely.
Part of what makes this attractive is that it is an AI system driven by data science experts who understand both the amazing things AI can do and the places where it can fall short.
While LLMs can be incredibly intelligent, they are not always wise. Data analysis is something of an art, and we are doing everything we can to ensure that you avoid its pitfalls.
It is easy to accidentally commit methodological errors, such as letting information leak between the training and test sets, that might not show up until new data are obtained.
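As one hypothetical illustration (my own sketch with simulated data, not an example from the product), a common form of this error is estimating a preprocessing step on all of the rows before splitting, so the eventual test set influences the training data:

```r
# A minimal sketch of one such pitfall: the predictor is centered and
# scaled using statistics computed from *all* rows, so information from
# the eventual test set leaks into training.
library(rsample)

set.seed(42)
dat <- data.frame(x = rnorm(100), y = rnorm(100))

dat$x <- as.numeric(scale(dat$x))  # uses the test rows too: leakage

split <- initial_split(dat, prop = 0.8)
train_set <- training(split)
test_set  <- testing(split)

# The safe order: split first, then estimate the preprocessing on
# train_set only and apply those statistics to test_set.
```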
These issues are fairly common, and they are reinforced in LLMs both directly and indirectly.
That’s a fair question, and I don’t have a great answer.
Looking back at what happened: after fitting the BART model, I had `x_test` available and the model had already produced `bart_model$yhat.test` predictions (because I passed `x.test` to the `bart()` function). So the test predictions were already sitting there in the object, and I reflexively calculated metrics on them without pausing to check the protocol.
… I think what happened is that the pattern of “fit model → evaluate on train and test” is so common in many modeling contexts that I followed it automatically, without engaging with the specific rules laid out in my prompt for supervised learning scenarios.
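For context, here is a minimal sketch (with simulated data, not the demo's actual analysis) of how `dbarts::bart()` produces test-set predictions as a side effect of passing `x.test`:

```r
# Passing x.test to dbarts::bart() leaves test-set predictions sitting
# in the fitted object, whether or not it is time to look at them.
library(dbarts)

set.seed(101)
n <- 200
x <- matrix(rnorm(n * 5), n, 5)
y <- x[, 1] + rnorm(n)

in_train <- sample(n, size = 160)

bart_model <- bart(
  x.train = x[in_train, ],
  y.train = y[in_train],
  x.test  = x[-in_train, ],  # test predictions computed as a side effect
  verbose = FALSE
)

# Posterior draws for the held-out rows are already in the object, which
# is what made it tempting to compute test-set metrics prematurely:
dim(bart_model$yhat.test)  # ndpost x n_test
```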
Prompting and skills go a long way.
Another approach is to build implicit guardrails into the system itself.
Not coincidentally, this has always been a design goal (video) for tidymodels.
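As a sketch of what those guardrails look like in practice (my own example, not from the talk): in tidymodels, preprocessing and the model are bundled into a single `workflow()`, so resampling re-estimates the preprocessing inside each fold, and `last_fit()` is the one step that touches the test set.

```r
library(tidymodels)

set.seed(1)
split <- initial_split(mtcars, prop = 0.8)
folds <- vfold_cv(training(split), v = 5)

rec <- recipe(mpg ~ ., data = training(split)) |>
  step_normalize(all_numeric_predictors())

wflow <- workflow() |>
  add_recipe(rec) |>
  add_model(linear_reg())

# Preprocessing is re-estimated within each resample, so no leakage
res <- fit_resamples(wflow, resamples = folds)
collect_metrics(res)

# The test set is used exactly once, at the very end
final <- last_fit(wflow, split)
collect_metrics(final)
```

The design choice is that the safe workflow is the path of least resistance: the API makes it harder to touch the test set early than to follow the protocol.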
If/when you evaluate this or other LLM tools to determine their accuracy, keep a few things in mind: