episode 210
Getting Specific with AI via Categories: Chat, Headless, and Embedded
Everyone’s talking about AI. Almost no one’s getting specific.
And that’s the problem.
Rob and Justin break down the three AI categories that matter: chat agents you talk to, embedded assistants baked into your workflows, and headless systems running quietly in the background doing the grunt work nobody wants to do.
This isn’t theory. This is how AI is finally working for real companies.
They walk through Vendor Bot, Scheduler Bot, and Budget Bot — small, focused tools that do one thing exceptionally well. Because here’s what keeps happening: the “one bot to rule them all” projects collapse under their own weight every single time.
But stack a few narrow, reliable bots together? That’s when things get interesting.
If you’ve been wondering why AI works for some teams and stalls for others, this episode has your answer. Spoiler: it’s not about going bigger. It’s about getting ruthlessly specific.
The “we’re all in on AI” era is over.
The “we know where it fits” era just began.
If you like the show, be sure to leave us a review or share it on your platform of choice.
Episode Transcript
Announcer: Welcome to Raw Data with Rob Collie, real talk about AI and data for business impact. And now, your host, CEO and founder of P3 Adaptive, Rob Collie.
Rob Collie: And welcome back to the podcast there, Justin.
Justin Mannhard...: Thank you.
Rob Collie: I was going to make fun of the Starbucks CEO for saying that they were going all in on AI, because the [00:00:30] first headline I saw about this was clearly making fun of them, and it seemed like an easy thing to pile on. Here we go, another big time CEO saying all in on AI or whatever, without even really knowing what they're talking about.
Justin Mannhard...: We've been conditioned to feel that way though, by the way, because of how multiple CEOs pursuing this have acted.
Rob Collie: Oh, yeah. This would be a time-honored tradition at this point, so we're expecting it. And we're just like, "oh, here's another delicious one [00:01:00] coming down the pipe."
But then it turns out I went and read up on what he actually said, and it all just sounds entirely reasonable. It's like, oh, dammit, we don't get to make fun of him. I mean things like-
Justin Mannhard...: Go on.
Rob Collie: Being able to just talk to your mobile app, talk to your phone and say, "Hey, order my normal Starbucks." I know that in theory, Siri is supposed to do that for you on your iPhone or whatever, but I never trust that.
But if it was the Starbucks app that I was talking to, why not? Again, use AI as the [00:01:30] conversational shock absorber that interprets the human intent. That's one of the things that it is absolutely best at. And by the way, that's exactly what Bill Krolicki's Vendor Bot from the previous episode does. It's not voice in that case, but it's using AI.
The only part of that solution, the Vendor Bot solution, that's AI is the interpretation of the email that comes back from the vendor. So, it's another example of that. Or the other things mentioned in the articles, instead of having manuals [00:02:00] that you have to break out when the espresso machine is breaking, you just have this AI agent you pull up on a tablet, and it interviews you about what's going on.
Again, entirely sensible, nothing to make fun of here. So I wonder, are we exiting the golden era of hotshot CEOs talking about AI and completely sticking their foot in their mouth? Are we entering a new era of credibility? I don't know.
Justin Mannhard...: It has echoes of when [00:02:30] companies would stand up and be like, "We're all in on big data and we're going to make big data work for us." And big data was this big easy term to say that we all thought we knew what it meant, and we were all wrong all at the same time.
And some of that's starting to happen with AI. We just say, "We're all in on AI," and it's like, "Oh, of course." But then you hear about the use cases and you're like, hmm, okay.
Rob Collie: On one end of the spectrum, the one we've made fun of over and over and over again is the Fiverr CEO, telling the whole company, " [00:03:00] Everyone needs to go figure this out, or you're doomed." No plan at all. No specificity, completely abdicating his responsibility as a leader, bad, bad, bad, bad, bad.
And I was really expecting this Starbucks thing to be an equal parts abdication and not making any sense. It was neither. It's very specific.
Justin Mannhard...: Which is a key.
Rob Collie: The specificity is a theme that we've been hammering here. Again, the cynical gallows humor part of me was disappointed. This wasn't one of those cases, but I guess in [00:03:30] general, this is a good thing. We want this kind of progress.
Justin Mannhard...: Well, there's a lot of CEOs, a lot of companies. We might get another crack at this one.
Rob Collie: Yeah, I mean, I don't think we've quite retired this bit yet. Just give us one more taste, folks, one more bite for a moment. Circling back, speaking of last week's episode, and Bill-
Justin Mannhard...: Great chat.
Rob Collie: Yeah, so much fun. I'm only halfway through my re-listen to that one, but it feels really good. Even you and I re-listen to these episodes, not out of vanity, but out [00:04:00] of we miss things and/or if we don't miss them, we forget about them in the course of the week. And it's like, oh, I need to be taking notes.
There are things that I end up cycling back to and making a point of here at P3 behind the scenes as a result of re-listening to the podcast where I was a participant, which is not what people would expect necessarily. I wonder what percentage of podcast hosts re-listen to their own episodes. I bet it's not very high.
Justin Mannhard...: And I think this applies to actors [00:04:30] or musicians, people that put their work down on some sort of stored media. There are two main archetypes, and one of them never engages with their work after they've done the job. But I just think in this format we're always exploring ideas or we're having moments of epiphany or whatever. It's actually really useful to go back and listen.
Rob Collie: He did email us, Bill, today after listening to it and told us he forgot to share [00:05:00] something really kind of neat, which is the impact of Vendor Bot. Before Vendor Bot, and again, if you haven't listened to the episode, you really should go back and listen to it, just such a great grounding example of AI usage and impact.
Before Vendor Bot, basically like 40% of the time that they were about to go manufacture something, they didn't have all the materials. The vendors hadn't come through on time, so they had a 40% not-on-time [00:05:30] rate. Which, in the real world... the real world's chaotic, and all it takes is one vendor to be late and the whole thing's late. They all have to be on time. So, they've gone from a 40% late rate, where at least one supply wasn't there, and they've cut it in half.
Justin Mannhard...: In a pretty short amount of time.
Rob Collie: Yeah, that sure seems worth it when you consider the impact to the schedule and the efficiency and just how much it costs to not be able to manufacture the thing you're supposed to manufacture [00:06:00] that day. This also hides, I think, the other benefit. Even though the late rate is cut in half, which is huge, their visibility into when things are going to be late has also increased, so they can react in advance.
So, even for the remaining 20% of the time that things are late, their efficiency in those scenarios is also considerably higher. They've done more than erase half of the problem with this one very simple application, [00:06:30] so way to go Bill and team at International Packaging.
Justin Mannhard...: It's fun to see what for me was, though I didn't even realize it at the time, effectively a hypothesis about how impact would occur. You compare the idea of, "Oh, we'll just put AI on top of all of our data or information and magic will happen," and how little impact is falling out of those things. Whereas you have this really hyper-focused, specific application of AI: small surface area, huge impact.
Rob Collie: [00:07:00] And then wash, rinse, repeat, right?
Justin Mannhard...: Yeah.
Rob Collie: Yes. There are many, many such places to apply it.
Justin Mannhard...: Exactly.
Rob Collie: And in Bill's example, no one lost a job.
Justin Mannhard...: Nope.
Rob Collie: In fact, everyone's jobs became more clear-headed because they're not having to deal with this junk, and their bottom line improved. Where's the downside here? So, under that same vibe of reducing to practice, getting into examples and not just the general, we're going to do a little bit of both I think in this episode: sort of the general lens on AI, how it operates, [00:07:30] what it does, but also some specific examples.
Justin Mannhard...: Some theory and some practice.
Rob Collie: So under that heading of theory and practice, I've been decomposing AI solutions into three different types of user experiences, three different types of front ends. One is chat clients, just like ChatGPT.com, except purpose-built, focused chat clients that are intelligent about helping you solve certain problems and doing certain things for you. They're pre-trained and given all the context and everything, [00:08:00] a chat client, a chat agent, right?
Speaker 1: Yep.
Rob Collie: Some reasonably large percentage of AI scenarios are going to be like that. Second one is embedded, so embedded into other applications. In a chat agent, you're going to the chat agent and that is the application that you're using. As opposed to, let's say you're using Salesforce and there's something in the sidebar, what's in the sidebar of that Salesforce form?
[00:08:30] It might be a chat agent that's embedded there, but even if it is a chat agent embedded there, it's going to have access to whatever it is you're currently looking at. The context of what you're looking at in the other application, like Salesforce, is going to be made available to that embedded experience. But it might not be a chat agent. It might also be a button that you click that triggers something else to happen.
Justin Mannhard...: Like little reminders like, "Hey, I noticed these fields [00:09:00] need to get filled in. Would you like that to happen?"
Rob Collie: Or generate email to this contact that then takes a look at the history of your interaction with them, and it doesn't have to be necessarily chat. But the point is purpose-built embedded into something else, very broad category.
And the third category, the so-called headless agents, things that are running in the background without any direct user interface that you're seeing in front of you. But ultimately, the way you see these things [00:09:30] is when they surface information to you in some other way, they might notify you. You might get a message in Teams or a message in Slack. It might just be making updates.
I don't know. Really simple, simple, simple examples of things we already see, like people trying to build in summarizing what your day is going to look like ahead of time. We've been seeing stuff like that forever.
Justin Mannhard...: Giving you a summary of all your DM channels at work.
Rob Collie: Which we've found some of these are really not very useful yet. The custom-built ones that are aware of your business and [00:10:00] everything and are able to see across stuff will probably do a much better job. The Slack summary feature is more worthy of chortling; it's more of an entertainer, given the things it decides are significant. Like the summary it gave recently where it's like, "Yeah, you guys had a conversation and decided to update this one emoji in Slack so that it was compatible with dark mode." It's just so silly.
Justin Mannhard...: It's pretty significant.
Rob Collie: But headless agents, they cover a lot of ground.
Justin Mannhard...: Sure.
Rob Collie: And we've been [00:10:30] having an uptick in conversations with clients about AI use cases lately. How are you seeing things break down across those categories?
Justin Mannhard...: What's been really exciting for me is, apart from the uptick in conversations with clients about AI and specifically conversations that are getting more specific, I've been having conversations with people for a long time about AI in general, but to get more specific on ideas has been [00:11:00] really energizing for me.
Things are starting to map really clearly into those types of categories, which I think is useful for leaders, because these types of frameworks help you think about the problem. When you look at a problem in your business and you say, "Oh, can we use AI for that?" In my head, I have this picture of somebody holding two ends of an extension cord, and one says AI and one says problem. And you're just like, "What does that mean? What do I do here?" And so when you start talking about problems in a specific way, [00:11:30] it makes sense to try and solve this problem in this way for these reasons, because of what you're trying to do.
A chat-based agent I think is the easiest thing for all of us to wrap our heads around; ChatGPT has been around for a long time now, compared to how fast this has all been moving. We all know what it's like to get in and talk to one of these things. But what ChatGPT cannot do is understand your business's specific context: specific process rules, specific tasks [00:12:00] it should do, specific data it should access. And so we're talking about, and now actually building, these types of things on our own AI platform, P3 AI.
And so an example of this we're working on with a client now is something we just affectionately call Scheduler Bot. Its job is to... You interact with that as a user. You come in and you ask the questions about the production schedule at this company, and so it can go look at a Power BI model, it can look at the data that's happening with the schedule, [00:12:30] and then come back and say, "Hey, here's problems with the schedule and here's solutions you might consider."
Well, how does it come up with those solutions? It's grounded in some external memory, or this is the idea that we're going on next, where we have the business rules by which you could resolve some of those problems. You can't mess with these clients. These crews can or cannot go in these different directions.
And so what I think is really important about these types of examples [00:13:00] is this is not an all-purpose chat agent for this entire company. It's being built to have a very specific role, and that's what's allowing us to get really clear on how we set it up and really clear on how it should behave and to create a really reliable user experience. And so what's interesting about this idea, and I was thinking about some of this in real time, is you could almost see a world where this flips to headless [00:13:30] in a way too.
Because the value in the AI is its ability to reason over the data and the rules and help suggest solutions much faster than a team could do manually. So it's like, do you want to prompt this thing to do that, or do you want it always doing that and delivering, like, a morning report?
Rob Collie: That makes sense. If I launch the Scheduler Bot chat agent, guess what the first thing I'm going to say to it is almost every time?
Justin Mannhard...: What are my [00:14:00] scheduling problems?
Rob Collie: Either, "Tell me about the scheduling problems that we've got right now," or, "Hey, let's make today's schedule or tomorrow's schedule," whatever. Those are going to be very predictable first interactions. And so if that first interaction is so predictable, why wait? Why shouldn't I show up in the morning or whatever and have a digest already produced for me?
It's like, okay, here's today's problems and here's some potential solutions [00:14:30] and maybe even multiple choice, "Do you want to run with solution one, two or three, or do we need to talk about generating a fourth?" And then you're into a chat experience.
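The idea Rob is describing, promoting a chat agent's predictable first prompt into a scheduled headless run, can be sketched in a few lines. Everything here is illustrative: the prompt text, the function names, and the delivery channel are hypothetical stand-ins, and `agent` is any callable that answers a prompt.

```python
# A sketch of turning the canned "first question" into a headless
# morning digest. The agent is any callable that answers a prompt;
# names and the delivery channel are illustrative.

MORNING_PROMPT = (
    "Summarize today's scheduling problems and propose up to three "
    "numbered solutions for each."
)

def morning_digest(agent, deliver) -> str:
    """Run the canned first prompt headlessly and push the result out."""
    answer = agent(MORNING_PROMPT)
    digest = f"Scheduler Bot morning digest:\n{answer}"
    deliver(digest)  # e.g. post to a Teams or Slack channel
    return digest
```

Wired to a real scheduler (cron, Power Automate, and so on), the same agent configuration serves both the headless digest and the follow-up chat experience.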
Justin Mannhard...: Yeah, and that's an interesting hybrid idea right there where you could have a very similar configuration of AI tooling, doing that headless part, and then also having the chat experience where, "I want to talk to you more about this particular job or order and [00:15:00] find some solutions there."
Rob Collie: But even there with the Faucets First philosophy, starting with the chat agent is the right way.
Justin Mannhard...: For sure.
Rob Collie: Just to get it working, and then you can take a step back and go, "Okay, now how do we improve this further?" Let's take the necessity of that first prompt that you give it. Let's cut to the chase. You're going to be giving it that prompt every day, so why don't we just give it that prompt?
Justin Mannhard...: It was like what we always wanted from alerts and Power BI. It's like once I realized I don't want to have to go in and [00:15:30] look at the dashboard, just tell me when there's a problem.
Rob Collie: That's actually really interesting, an AI-powered headless agent that's looking for things. We're talking about the chat with data experience. You have a chat agent that you can talk to about your Power BI models.
Well, again, you can take the headless version of that. Okay, just let me know what kind of anomalies are out there, and it'll need some guidance. The headless agent will need some guidance on what kind of anomalies to look for. We don't want it just out boiling the ocean [00:16:00] from scratch every single time it kicks off.
There's certain things that we want to know about. Once it knows that, then it can issue the initial queries to find anomalies like that, but then it can also intelligently drill down to provide more context on what it finds. You could show up in the morning with that as your alert. This thing has detected something and investigated it.
Justin Mannhard...: I'm finding in the calls I'm having with clients where a chat-based agent is a good idea, [00:16:30] and these are all conversations with different clients, different people that don't know each other. There's this natural tendency to describe them as topic bot. So even with our episode with Bill, Vendor Bot, Budget Bot, Scheduler Bot, Inventory Bot.
If you're not describing them that narrowly, that's maybe a clue to really start segmenting the use cases down. Because if you want high reliability, you need to be able to give it a very clear task. You can have an agent that's maybe general purpose just to help you explore your data, but for a [00:17:00] high-use type of scenario, a narrow description is a good indicator for me that we've got a problem that we can really attack and solve.
Rob Collie: This is such a strong parallel with something we've said for years about dashboards, which is don't try to have a dashboard that is the sales report, [00:17:30] the noun, the thing that's meant to answer all questions, and as a result misses the whole point that you have multiple workflows. And this thing by trying to be one-stop shop fits none of your workflows. None of the things that you actually need to actually do are well-supported by it.
So, we used to make the joke that some reasonable percentage of the dashboards you build, you should be able to put the Doofenshmirtz suffix -inator on the end, [00:18:00] or -izer, the blah blah blah optimizer or the schedulizer, the schedulinator, the budgetinator, and it's the same thing. It's the same theme except that in AI, the importance of this is cranked all the way up. You can get away with it. You can get away with being general purpose in dashboards.
You can get away with general purpose in the BI field. Even though you shouldn't, you really shouldn't. You can still get away with it.
Justin Mannhard...: You can, yeah.
Rob Collie: But you can't cheat in [00:18:30] AI. It's got to be something bot, something -inator, something -izer. It's got to be really tuned to that particular workflow, and I love it. I love things that force that kind of discipline. It's a kind of discipline that we all should have anyway, but it's usually not enforced by the marketplace, and we much prefer it to be enforced. I love this.
Justin Mannhard...: So we did have a follow-up chat with Bill about the budgetinator. Again, [00:19:00] we had the recipe: Budget Bot, the budgetinator. We had a very specific thing, but when you get into the details of the idea, you realize a chat-based experience was the right answer for this solution. So a user could have a chat experience for working through budgets, but you need that AI very grounded in a clear process that it's guiding the user through.
So you could fire up Claude desktop or ChatGPT right now and say like, "Hey, [00:19:30] I got to set my budget. Help me do that," and they'll be, "Oh, sure, Rob, here's some questions I have for you," and dah, dah, dah, dah, dah. But if we want to do that consistently at our company, no, no, no, no. We need the AI to be aware of this is the process you got to take through.
That's that concept of external context that you need to feed it maybe some databases so it understands like, "Oh, Rob left off at step three. Okay, I'm at step three. Okay, great, Rob, this is where we got to pick up the conversation," and it just creates the clarity of here's [00:20:00] all the things we can do to make this work.
Rob Collie: Yeah, it's like the interviewer pattern. This came up in the AI class I'm taking through Johns Hopkins, the sixteen-week course that I've been on. It's actually gotten really interesting lately. One example that the professor was sharing was an AI chat agent, a verbal, voice-enabled one, that helps soldiers perform first aid. He had some statistics that something like nine out of 10 times, first aid [00:20:30] is being applied by a soldier who isn't a medic, and so they have a flowchart of what first aid should be like.
This is well-established. It wasn't built for this AI scenario. They've had this for a long time, but it's a very complicated flow chart. It's not like people are running around with this flowchart in their head. In a way, what defines a medic is they've spent enough time with the flowchart to internalize it.
Justin Mannhard...: Yeah, they're an expert in the flowchart.
Rob Collie: I mean, of course they've had [00:21:00] some practice in applying each of the treatments that are the exits of the flowchart, but they also have to know the flowchart. And so, the solution that the professor was demonstrating was taking that flowchart and turning it into a graph database so that the system knows unambiguously what the flowchart is. There's almost like an arrow, like a token.
If you think of it as a game board, like a board game, at any moment your piece is on one of these nodes in the flowchart. This is where you are, [00:21:30] and if you're here, what's the question we need to answer? For example: is the patient conscious?
In this situation, the context that's fed to the LLM is very, very specific. It's like, look, our only job here is to ask the person on the other end of the line whether the patient is conscious, and based on their response, determine which branch to follow. That's where the LLM comes in: it asks the question and then interprets the response [00:22:00] as either yes, no, or ambiguous.
And if it's ambiguous, it circles back and says, "Hey, hey, hey, calm down. Slow down. Tell me are their eyes open?" Until it gets the answer, and then when it moves to the next state, it's grounded again in different ways. Now, it's got a different thing that it's trying to find out. It knows where it is in the process, but doesn't need to be carrying all this information. It's got a very, very, very specific task every single time. Its task is different based on where you are on the flowchart.
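The pattern Rob describes can be sketched as a tiny state machine: the flowchart lives in ordinary data, and at each node the model's only job is to map a free-form reply onto yes, no, or ambiguous. This is an illustrative sketch, not the class demo; in particular, `classify_answer` is a keyword stand-in for what would really be an LLM call, and the flowchart nodes are made up.

```python
# A minimal sketch of the flowchart-as-state-machine interviewer pattern.
# Each node asks one question; the model's only job is to classify the
# reply. Node names and treatments here are illustrative, not real
# first-aid guidance.

FLOWCHART = {
    "start": {"question": "Is the patient conscious?",
              "yes": "check_breathing", "no": "open_airway"},
    "check_breathing": {"question": "Is the patient breathing normally?",
                        "yes": "DONE:monitor", "no": "DONE:rescue_breathing"},
    "open_airway": {"question": "Is the airway clear?",
                    "yes": "DONE:recovery_position", "no": "DONE:clear_airway"},
}

def classify_answer(reply: str) -> str:
    """Stand-in for an LLM call returning 'yes', 'no', or 'ambiguous'."""
    text = reply.lower()
    if any(w in text for w in ("yes", "yeah")):
        return "yes"
    if any(w in text for w in ("no", "not", "isn't")):
        return "no"
    return "ambiguous"  # off-topic or unclear: re-ask the same question

def step(node: str, reply: str) -> tuple[str, str]:
    """Advance one node. Returns (next_node, next_question_or_action)."""
    verdict = classify_answer(reply)
    if verdict == "ambiguous":
        return node, FLOWCHART[node]["question"]   # stay put, re-ask
    nxt = FLOWCHART[node][verdict]
    if nxt.startswith("DONE:"):
        return nxt, nxt.removeprefix("DONE:")      # terminal treatment
    return nxt, FLOWCHART[nxt]["question"]
```

Note how the Backstreet Boys question would simply classify as ambiguous, so the agent stays on the same node and repeats its question, which is exactly the on-mission behavior Rob describes next.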
[00:22:30] So, that same thing in the budget case, there's an interview process and you can imagine what the budget interview process looks like in a flowchart. By the way, in the Johns Hopkins example, it was really neat that it was constrained. In his presentation, he was saying, the thing asked him, "Is the patient conscious," and he replies, "Which of the Backstreet Boys was most influential in their post-Backstreet Boys career?"
Justin Mannhard...: Of course, yeah.
Rob Collie: And it comes back and says, "Hey, no, no, [00:23:00] I'm not here for that. That's not my point. Is the patient conscious?" It refuses to go off-topic, it knows its mission, it understood the assignment, as the kids say.
Any complex process that can be handled via interview, especially if there's a flowchart process, it's a really fascinating thing. Now, of course, in the first aid example, all that really matters is the endpoint of the flowchart. In the end, you're going to apply a particular treatment.
Whereas [00:23:30] in the case of the budget bot, your answers along the way are being recorded in a separate database, because every answer to every question isn't just getting you to the next question. It's oftentimes like, "No, we should be 10% higher this year," and so that answer is important. So, the path behind you is also significant, whereas in the first aid case you can throw the path away; in the end, you're applying a tourniquet somewhere. So many processes are like this.
Justin Mannhard...: And what's cool about these types [00:24:00] of use cases is because I continue to find lots of value in getting very specific, so in the case of scheduler bot or budget bot, you realize there's these routine things that we want the AI to do.
For example, many users when they're planning their budget will want to say something like, "Well, let's take last year and just adjust it up for inflation." That's an example where, hey, maybe we won't do this in the first iteration because Faucets First, but down the road we say, "Well, let's actually add that as a tool [00:24:30] in the MCP where it knows how to do that reliably and very consistently again and again and again over time."
Or, "I want to spread for seasonality," or whatever the case might be, and you just get a very natural user experience to just be interviewed and answer very naturally. And then the finance team has basically baked in the rules, "And here's how you do that," when you commit it to the database behind the scenes.
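The kind of tool Justin is describing, where the finance team's rules are baked into deterministic code the agent calls rather than math the model improvises, might look like this. The function names, rules, and rounding choices are hypothetical illustrations, not P3's actual implementation.

```python
# A sketch of deterministic budget tools you might register with an
# agent (e.g. as MCP tools) so routine math is done by code, not by
# the model. Names and rules here are illustrative.

def adjust_for_inflation(last_year: dict[str, float], rate: float) -> dict[str, float]:
    """Scale every line item by (1 + rate), rounded to cents."""
    return {item: round(amount * (1 + rate), 2) for item, amount in last_year.items()}

def spread_for_seasonality(annual_total: float, weights: list[float]) -> list[float]:
    """Split an annual figure across periods using normalized weights."""
    total_weight = sum(weights)
    return [round(annual_total * w / total_weight, 2) for w in weights]
```

Because the tool is plain code, "take last year and adjust it up for inflation" produces the same numbers every time, for every user, which is the reliability point being made here.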
Rob Collie: That's really clearly [00:25:00] emerging for us as well. I think this is something that you might be teeing up here: it depends on how you want to look at it. You can say it two ways. One way is top down. The other way is bottom up.
You can take the budget case and say, "Hey, budget's the goal," but it turns out that it decomposes: we need individual agents that are really good at adjusting for inflation. We need an agent that's really good at asking questions of the user, or whatever. [00:25:30] And then you could build each of those as an independent chat agent. It wouldn't be very useful on its own, but you can test it, and then you assemble the budget agent out of all of those other agents. That would be a top-down approach to things. I find that to be sort of advanced mode.
Justin Mannhard...: Yeah, for sure.
Rob Collie: I think we're all going to be moving towards that over time, but I think in the short term we're going to be going more bottom [00:26:00] up. Best example I've got of this is something we've talked about here internally, Kellan Danielson here at our company, president and COO, he is systems and optimization. He sees the matrix like no one else does, and we half jokingly made an agent here called Kellan's Brain that had access to all the sorts of data sources that Kellan-
Justin Mannhard...: Literally everything.
Rob Collie: ... everything Kellan ever touches, it has access to all those things. And the idea was we were going to start training all [00:26:30] kinds of rules into it as well, and operating procedures and contexts, so that it knew. Again, this is like the boil-the-ocean approach. Now, we were being specific in that we weren't attacking the whole company; we were building something to help Kellan with his day-to-day job. It turned out even that was too much.
Justin Mannhard...: Too much. It got confused, yeah.
Rob Collie: Too high level. So, we've moved away from that for the moment, and instead we're building a handful of specific agents, [00:27:00] targeted one Kellan-style workflow at a time. And at first we were thinking, instead of one, we need six. That might be true. That is true.
It's just that when we have the six, we then might be able to, and we're definitely going to try, build a seventh one that is Kellan's brain that has access to all six of these others as subcontractors. In the same way that you can build a chat agent and then [00:27:30] think about are there headless applications of it? You can build individual chat agents and then say, "Now, is there a possibility that we can make this more like one-stop shop for people?" The goal of I just walk up to a single chat agent at my company that can do anything was the original boil the ocean image that I think most people had.
Justin Mannhard...: Yeah, because we saw ChatGPT and we could ask it about [00:28:00] training for a marathon or gardening or learning about physics.
Rob Collie: It just turns out that because none of my business is trained into it, it's just not possible. But I think we will at some point be able to increasingly realize that original goal, but the way to get there is bottom up. If you end up with a menu of 45 specialists, and it just happens that you've bookmarked each of the [00:28:30] 45 specialists, you've got a really long browser bookmark bar, like which specialist do I need to talk to right now? Having a 46th one that knew where the other 45 were and based on your question, which one or ones to invoke in the background, I think that's where we're going to be headed.
Justin Mannhard...: I think so. Without some sort of breakthrough that we can't predict, making the scope small creates a lot of reliability and a lot of [00:29:00] value. But we want things that can do different actions or infer different types of analysis. In the example with Kellan's Brain, you saw pretty quickly how it got tired and confused under the weight of all the context it was trying to carry and-
Rob Collie: Yeah, poor thing.
Justin Mannhard...: Poor thing, poor Kellan's Brain. The call I had yesterday with another client was actually really interesting. You've talked about how we've got structured data, and then we've got [00:29:30] all this unstructured data, and AI now brings the unstructured data into play. And this one's a headless agent.
It's more just deploying that in kind of an ETL capacity. This company gets PDF documents from, I believe, their resellers. On these documents are records of their products being sold, but it's also a situation where, if a customer went to a store, they'd buy some of this company's product and somebody else's product, [00:30:00] and so they can do this really interesting product basket analysis, like, "Oh, when Rob buys my water bottle, he also tends to buy that other brand's other thingamajig."
And they've never been able to do this analysis, but AI is able to read this PDF and say, "Oh, this is the information on it, and I need to go put it here in this database so someone can actually analyze this in aggregate." And they just want this working all the time. [00:30:30] They've been sitting on these mountains of documents forever and ever and basically could do next to nothing with it.
Rob Collie: Sorry, what's in the PDFs?
Justin Mannhard...: They're like invoices, so think of it like a record of sale.
Rob Collie: Oh, okay, okay. The PDFs are not of a standard format, like each reseller might have a different PDF format.
Justin Mannhard...: Yeah, they're not in a super consistent format. There's a ton of them, and so AI's been able to figure this out.
Rob Collie: Does it figure it out by examining the structure of the PDF? [00:31:00] I'm assuming it does, as opposed to treating the PDF as an image.
Justin Mannhard...: I believe, yes, the structure of the document itself.
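[Editor's note: The headless ETL flow Justin describes could be sketched roughly like this. This is a minimal sketch, not the client's actual pipeline: the `call_llm` stub, the prompt wording, and all field and table names are hypothetical, and a real system would call an actual model API and extract text from the PDFs with a PDF library first.]

```python
import json
import sqlite3

# Stand-in for a real LLM call. In the actual pipeline this would hit a
# model API; here it returns canned JSON so the plumbing runs offline.
def call_llm(prompt: str) -> str:
    return json.dumps(
        {"reseller": "Acme Outdoor", "product": "Water Bottle", "qty": 2, "price": 14.99}
    )

PROMPT_TEMPLATE = (
    "Extract reseller, product, qty, and price from this invoice text. "
    "Reply with JSON only.\n\n{text}"
)

def ingest_invoice(conn: sqlite3.Connection, invoice_text: str) -> None:
    """One document per call: extract fields, then land them in the database."""
    raw = call_llm(PROMPT_TEMPLATE.format(text=invoice_text))
    fields = json.loads(raw)
    conn.execute(
        "INSERT INTO sales (reseller, product, qty, price) VALUES (?, ?, ?, ?)",
        (fields["reseller"], fields["product"], fields["qty"], fields["price"]),
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (reseller TEXT, product TEXT, qty INTEGER, price REAL)")
ingest_invoice(conn, "ACME OUTDOOR ... Water Bottle x2 @ $14.99 ...")
rows = conn.execute("SELECT reseller, product, qty FROM sales").fetchall()
print(rows)
```

[The point of the shape: the model only does the fuzzy extraction step; the target schema and the database write stay deterministic, which is what makes the agent safe to run headlessly "all the time."]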
Rob Collie: Because I have found that the LLM's ability to digest images, it's great, and then it just falls off. The thing I was doing was taking screenshots of the rosters page of our fantasy football league, 12 rosters in perfect table form. It's very well formatted. The image is as clear as it can be.
If I [00:31:30] take a screenshot of just one person's roster, like a 15 row table, one column, essentially, it chews that up and tells me, "Yeah, this is who the players are on their team and what positions they're in," and blah, blah, blah, blah, blah. No problem. But if I feed it all 12 of them at once, the hallucination was amazing.
Justin Mannhard...: I remember trying to work on this for you a long time ago.
Rob Collie: But this is something I've just done recently. I was trying to automate some of my personal fantasy football workflows, because I don't have time [00:32:00] to be analyzing for trades this year. I was giving it that screenshot and asking it, "Who's on whose team?"
It was putting players on my team who have been out of the NFL for five years. They were just famous fantasy football players from before. It's just filling them in. Even the players it's getting right, it's putting on the wrong teams.
Again, you know how it is. They're so confident. It never said, "Hey, I got confused. I got overwhelmed. You [00:32:30] shouldn't trust me. What I'm about to tell you is really suspect." Nope.
But here's the thing: not the same story with HTML. If I gave it the HTML file representing that same grid of 12 rosters, it had no trouble with that. We've gotten used to the idea that these things are really, really good at absorbing images as input. But we're not typically giving it an image of the dimensions and the complexity that I was giving it. It was [00:33:00] jaw-dropping.
I shared it with Brian Julius, he put it into one of his favorite models of the day and came back and said, "No, it one-shotted it. It was perfect." I'm like, "Nope." I went and looked through it, like, no, no. It just looked like it got it right. It was doing the same thing again. It was reporting garbage, but it looked credible.
Justin Mannhard...: Almost everything that comes out the other end-
Rob Collie: That's the issue.
Justin Mannhard...: This client's looking for some help to incorporate this idea that they've baked into some other ideas. What's cool about it is, it seemed like there was far too much data to key in by hand. [00:33:30] It was just a non-starter.
Now, we've got a technology that seemingly can extract that information, but then also understand the target format it needs to get recorded in so we can do something else with it, and then it can handle like, "Oh, this one looks kind of funny. The puzzle I'm trying to solve looks like this. Okay, I'll do this." And they've now unlocked this idea that they can leverage some information that they've never been able to leverage, other than ad hoc ways.
And then the other example [00:34:00] was essentially reviews. Again, understanding human intent: a review comes in and gets recorded in a database, read that review and determine if the client was happy or upset or whatever. And sure, we could have thrown old-school machine learning algorithms at that, but these AIs, they're just so good at this.
Rob Collie: Yeah, they're built-
Justin Mannhard...: They're built for this.
Rob Collie: They are built to understand intent. The old machine learning algorithms would require [00:34:30] either pre-training or you'd have to give them examples like, "These 50 were positive, these 50 were negative." But the modern LLMs take text, take words, take sentences, and turn them into their unambiguous essence of meaning.
They really don't hallucinate in terms of understanding what you said. That's one place where it's just locked in. It's better than a human at understanding what you're saying.
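[Editor's note: The contrast Rob draws, a pre-trained classifier with labeled examples versus just asking the model, can be sketched like this. This is a hypothetical sketch: the `call_llm` stub below is a fake keyword check standing in for a real model call, and the prompt wording is illustrative, not from the episode.]

```python
# Stand-in for an LLM call: a real system would send the prompt to a model.
# The keyword check is only a stub so the example runs offline.
def call_llm(prompt: str) -> str:
    review = prompt.split("Review:")[-1].lower()
    return "negative" if any(w in review for w in ("broke", "refund", "terrible")) else "positive"

def classify_review(review: str) -> str:
    """Zero-shot sentiment: no labeled training set, just an instruction."""
    prompt = (
        "Is the customer happy or upset? Answer with exactly one word, "
        f"'positive' or 'negative'.\n\nReview: {review}"
    )
    label = call_llm(prompt).strip().lower()
    # Guard the pipeline against the model replying off-format.
    return label if label in ("positive", "negative") else "unknown"

print(classify_review("Love the water bottle, bought two more."))
print(classify_review("It broke in a week, I want a refund."))
```

[Compare that to the old approach, which needed a curated set of labeled examples before it could classify anything.]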
Justin Mannhard...: And so the value there is, again, [00:35:00] if you're getting a high volume of those things and you're trying to find patterns and categorize them, that's a lot of data-scrubbing work that was really hard to do before.
Rob Collie: And again, I think it's a very feasible problem for AI systems to digest, as long as you don't give them all the files at once. Give them one at a time. If it's digesting the contents, the structure of the PDF, in the same way that Power Query understands the structure of a PDF, it can be very, very [00:35:30] reliable in those circumstances.
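[Editor's note: Rob's one-at-a-time rule is a small structural choice, but it's the difference between his 1-roster success and 12-roster hallucination. A minimal sketch, with a hypothetical `call_llm` stub that just records each request:]

```python
# Rather than one mega-prompt carrying all 12 rosters, send one document
# per call so each request carries a small, reliable context.
llm_calls = []

def call_llm(prompt: str) -> str:
    llm_calls.append(prompt)  # record the call; a real model request goes here
    return "parsed roster"

def parse_rosters(roster_texts):
    """Loop over documents: one focused extraction call each."""
    return [call_llm(f"Extract the player table from:\n{t}") for t in roster_texts]

results = parse_rosters([f"roster {i}" for i in range(1, 13)])
print(len(llm_calls))  # 12 small calls instead of 1 overloaded one
```

[The trade is more API calls for dramatically less context per call, which is exactly the "ruthlessly specific" scoping the episode keeps coming back to.]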
Now, our hockey league has all of these old digitized images of old score sheets. Those are images, and I wonder if on a one-off basis those could be digested and turned into data, because we have this problem in Indy inline where there's the historical era, where they weren't entering game-level stats into any system, and we end up with this two-fact-table situation.
And Justin, it's just so ugly. We need to unite the [00:36:00] regimes, so maybe it's time to try that out as a pet project.
Justin Mannhard...: Sign me up.