72 comments
  • chatmasta10m

    At Databricks summit there was a nice presentation [0] by the CEO of V7 labs who showed a demo of their LLM + Spreadsheet product.

    The knee-jerk reaction of “ugh, LLM and spreadsheet?!” is understandable, but I encourage you to watch that demo. It makes clear some of the obvious potential of LLMs in spreadsheets. They can basically be an advanced autofill. If you’ve used CoPilot in VSCode, you understand the satisfaction of feeling like an LLM is thinking one step ahead of you. This should be achievable in spreadsheets as well.

    [0] https://youtube.com/watch?v=0SVilfbn-HY&t=1251 (queued to demo at 20:51)

    • usrbinbash10m

      > If you’ve used CoPilot in VSCode, you understand the satisfaction of feeling like an LLM is thinking one step ahead of you

      That "satisfaction" vanished pretty damn quickly, once I realised that I have often more work correcting the stuff so generated than I would have had writing it myself in the first place.

      LLMs in programming absolutely have their uses, lots of them actually, and I don't wanna do without them. But they are not "thinking ahead" of the code I write, not by a long shot.

      • ramraj0710m

        I really don’t know what the detractors of Copilot are writing. The next Stuxnet? Whether I’m doing stupid EDA or writing some fairly original frameworks, Copilot has always been useful to me, both for writing boilerplate code and for completing more esoteric logic. There’s definitely a slight modification I have made in how I type (making variable names obvious, stopping at the right moment knowing Copilot will complete the rest, etc.), but if anything it has made me a cleaner programmer who types 50% fewer characters at a minimum.

        • fhd210m

          While it could be that you and them work on different kinds of code, I believe it's just as likely that you're just different people with different experience and expectations.

          A "wow, that's a great start" to one could be a "damn there's an issue I need to fix with this" to another. To some, that great start really makes them more productive. To others that 80% solution slows them down.

          For some reason, programmers just love to be zealots and run flamewars to promote their tool of choice. Probably because they genuinely experience that it's fantastic for them, and the other guy's tool wasn't, and they want them to see the light, too.

          I prefer to judge people on the quality of their output, not the tools they use to produce it. There's evidently great code being written with uEmacs (Linux, Git), and I assume that, all the way on the other end of the spectrum, there's probably great code being written with VSCode and Copilot.

        • hlfshell10m

          In my experience using LLMs like CoPilot:

          Web server work in Go and Python, and front-end work in JavaScript - it's pretty good. It's only when I try to do something truly application-specific that it starts to get tripped up.

          Multi-threaded Python work - not bad, but it occasionally makes mistakes around scope or appropriately safe memory access, which can be show-stopping.

          Deep learning, computer vision work - it gets some common PyTorch patterns down pat, and handles the basic computer vision work that you'd typically find tutorials for, but struggles on any unique task.

          Reinforcement learning for simulated robotics environments - it really struggles to keep up.

          ROS2 - fantastic for most of the framework code for robotics projects, really great and recommended for someone getting used to ROS.

          C++ work - REALLY struggles with anything beyond basic stuff. I was working with threading the other day and turned it off, as all of its suggestions would never compile, let alone do anything sensible.

      • nunodonato10m

        They are with me. And with many other people. Perhaps it's the quality of your code that is preventing better completions. (Or the language you use?)

        There are a few things that really help the AI understand what you want to do; otherwise it might struggle and come up with not-so-good code.

        Not to say it gets it right every time, but definitely often enough for me not to even consider turning it off. The time savings have been tremendous.

        • EGreg10m

          I can see this kind of vaguely blaming the person becoming more the norm when LLMs are responsible for pre-crime and restrictions, etc.

          “Oh you couldn’t take a train to work? Must have been something you did, the Palantir is usually great and helps our society. It always works great for me and my friends.”

          • bongodongobob10m

            Nah, that's not it. It's more like complaining that someone has to drive the train and therefore "is completely useless to me, it can't read my mind so it's trash".

      • solumunus10m

        That's because you were using CoPilot. Try a much better option such as Supermaven. I unsubscribed from CoPilot for similar reasons but after using Supermaven for 3-4 months I will never cancel this subscription unless something better comes along. It's way more accurate and way faster.

      • Kiro10m

        That's not my experience at all. I very seldom need to correct anything Copilot outputs.

      • pydry10m

        >LLMs in programming absolutely have their uses

        Absolutely. LLMs let you make more programming mistakes faster than any other invention with the possible exceptions of handguns and Tequila.

        To be fair, it is also really good at spewing out industrial levels of boilerplate. As we all know, 99% of the effort in coding is the writing of code and the more boilerplate in your code base the better. /s

    • dimal10m

      > If you’ve used CoPilot in VSCode, you understand the satisfaction of feeling like an LLM is thinking one step ahead of you

      I did not get that feeling from CoPilot. I usually got the feeling that it was interrupting me to complete my thought but getting it wrong. It was incredibly annoying and distracting. Instead of helping me to think it was making it harder to think. Pair programming with an LLM has been great. Better than with most humans. But autocomplete sucks for me.

    • ssl-310m

      Seems like a reasonably-cromulent use-case -- or at least, it fits in with my own uses of LLMs.

      I suck at spreadsheets. I know they can do both useful and amazing things, but my daily life does not revolve around spreadsheets and I simply do not understand most of the syntax and operations required to make even fairly basic things work. It requires a lot of time and effort for me to get simple things done with a spreadsheet on the rare occasion that I need to manipulate one.

      There are things in life that I am very good at; spreadsheets are simply not amongst them.

      But I do know what I want, and I generally even have a ballpark idea of what the results should look like, and how to calculate it by hand [horror]. I just don't always know how to articulate it in a way that LibreOffice or Google Sheets or whatever can understand.

      LLMs have helped to bridge that gap for me, but it's a pain in the ass: I have to be very careful with the context that I give the LLM (because garbage in is garbage out).

      But in the demo, the LLM has the context already. This skips a ton of preamble setup steps to get the LLM ready to provide potentially-useful work, and moves closer to just making a request and getting the desired output.

      Having one unified interface saves even more steps.

      (And no, this isn't for everyone.)

    • delusional10m

      I don't think I understand that demo. It shows him using some built-in workflow thing (which isn't generally considered a core part of a spreadsheet) and then asking some LLM about the total price (I guess asking it to do math, which LLMs are notoriously bad at), but instead it looks like he gets some responses telling him what the term "total price" means, in prose that doesn't fit in the cells.

      What was I supposed to take away from that demo?

      • jemmyw10m

        The LLM doesn't do the math. It outputs something the app then interprets into a cell configuration with sums filled in. This is an area where LLMs can be quite good: you type out how you want to report the data, like "give me subtotals of column F at every month of the date column E and a grand total of F at the bottom".

        Except sometimes you can't seem to stop the prose.
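
        The deterministic half of that split is the easy part; here is a minimal sketch in pandas (column names and data made up by me) of what the app layer might run once the LLM has turned such a prompt into a structured plan:

            import pandas as pd

            # Toy data standing in for date column E and value column F
            df = pd.DataFrame({
                "E": pd.to_datetime(["2024-01-05", "2024-01-20", "2024-02-03"]),
                "F": [100.0, 250.0, 75.0],
            })

            # Monthly subtotals of F keyed by the month of E, plus a grand total
            subtotals = df.groupby(df["E"].dt.to_period("M"))["F"].sum()
            grand_total = df["F"].sum()

            print(subtotals)
            print("Grand total:", grand_total)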

    • Kiro10m

      Thank you. Tired of the usual jokers in threads like this. Right now the majority of comments are all sarcastic snark.

      • grape_surgeon10m

        I'm new here; Hacker News is supposed to avoid the modern Reddit trap, but I feel it often falls into it. The topics are more relevant to me, but the comments are often unbearably cynical and excessively dismissive.

        • bubblyworld10m

          Yeah, I feel like there's a real culture problem on HN right now, especially for topics that have received any degree of hype (AI and crypto, mainly). People can be excessively rude for no reason if you express an outsider view. Gets to the point where I can't trust anyone here to engage with me in good faith (the exceptions are a welcome blessing).

          I've come close to blocking the site on my network many times but it's an absolute goldmine of interesting info too... I'm not really sure if there's a solution, other than to practice emotionally disengaging from internet discussions.

      • nforgerit10m

        We jokers are tired as well

    • userbinator10m

      > If you’ve used CoPilot in VSCode, you understand the satisfaction of feeling like an LLM is thinking one step ahead of you.

      Tried it once. Didn't get "satisfaction"; instead felt deeply irritated by the "backseat driver". Maybe it works better if you're just churning out mediocre boilerplate.

      • Closi10m

        I’m churning out mediocre boilerplate here and it’s working great! Managing to build things crazy fast.

      • thomashop10m

        Like any tool, you need to learn how to use it and iterate with it. Trying it once is not enough.

      • rsynnott10m

        Yeah, for me it feels, as a proposition, and having used it briefly before turning it off, like “what if you could pair program with an ultra-confident, yet dangerously incompetent, intern, forever”?

        • dr_dshiv10m

          … with unlimited stamina, patience and capacity for negative feedback. If it was forever, you’d probably learn to take advantage of that resource!

    • rsynnott10m

      > If you’ve used CoPilot in VSCode, you understand the satisfaction of feeling like an LLM is thinking one step ahead of you.

      I’m not sure it’s so much ‘satisfaction’; it felt more like I was having a stroke until I turned it off. Its suggestions were, like, plausibly code, but completely contextually nonsensical in general; frankly IntelliJ’s old autocomplete-with-guessing functionality was better, as it at least _knows_ a certain amount about the codebase. Now, this was on a very large old codebase; no doubt it’s better if writing trivial new things.

    • bongodongobob10m

      The comments here are an absolute existential crisis. "Only I do spreadsheets good!" I agree, this looks really neat.

    • Havoc10m

      It has some challenges ahead still:

      https://i.redd.it/xr8uxqayv68d1.jpeg

      For demos, sure, but I'm not super hopeful about this, frankly. LLMs are inherently about generating the next token in sequential order. Nothing about real-world spreadsheets is linear like that - they're all interlinked chaos.

      • Kiro10m

        As the comments there correctly point out, that image is like 10 years old and has nothing to do with LLMs.

    • __loam10m

      > you understand the satisfaction of feeling like an LLM is thinking one step ahead of you

      Yes, "satisfaction"

  • bsenftner10m

    I've found that all the top foundation models already understand spreadsheets very well, as well as all the functions and all the common spreadsheet problems people run into using them. The Internet is chock full of spreadsheet support forums and tutorials, and the foundation models have all been trained on this data.

    With not very much effort, one can explain to an LLM "here is a spreadsheet, formatted as...", which takes about 150 tokens; then, with not much more mental effort in your favorite language to translate an arbitrary spreadsheet into that format, one gets a very capable LLM interface that can help explain complex arbitrary spreadsheets as well as generate them on request.
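
    Roughly, that translation step can look like the following (an illustrative sketch using openpyxl; the exact preamble wording and one-line-per-cell format are just one option, not a fixed scheme):

        from openpyxl import load_workbook

        PREAMBLE = (
            "Here is a spreadsheet, formatted as one line per non-empty cell: "
            "ADDRESS: VALUE, or ADDRESS: =FORMULA for formula cells.\n"
        )

        def sheet_to_prompt(path: str) -> str:
            wb = load_workbook(path)  # default data_only=False keeps formula strings
            ws = wb.active
            lines = [
                f"{cell.coordinate}: {cell.value}"
                for row in ws.iter_rows()
                for cell in row
                if cell.value is not None
            ]
            return PREAMBLE + "\n".join(lines)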

    I've got finance professionals and attorneys using a tool I wrote that does this, to help them understand and debug complex spreadsheets given to them by peers and clients.

    • ec10968510m

      The issue was that before, large spreadsheets would overflow the context so this “compression” technique helps the LLM do more from the same data.

      • bsenftner10m

        Which strikes me as an ingenious method of locking in their customers with a proprietary compressed format only their finetuned LLMs can parse.

  • IanCal10m

    I love the deep technical discussions on HN, and I'm disappointed to see anything AI related start to just resemble Reddit threads of people with knee jerk reactions to the title.

    This is interesting, it's about how you can represent spreadsheets to llms.

    • nunodonato10m

      Yes, for some reason we really have an established hate club around here. And the comments are usually the same thing every time.

  • galaxyLogic10m

    How will it work?

    I open an Excel spreadsheet and also the AI Copilot. Then whenever I want to do something with Excel, like "Show me which cells have formulas", Copilot will interact with Excel and issue some command I can't remember, in order to do that for me?

    Menus are good but often hard to navigate and find. So the CoPilot can give me a whole new (prompt-based) user-interface to any MS-application? Is that how it works?
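
    If so, I'd guess a request like "Show me which cells have formulas" compiles down to something no fancier than this under the hood (a purely illustrative sketch using openpyxl, not how Copilot actually does it; "book.xlsx" is a placeholder):

        from openpyxl import load_workbook

        wb = load_workbook("book.xlsx")  # data_only=False, so formula cells keep their "=..." strings
        ws = wb.active
        for row in ws.iter_rows():
            for cell in row:
                if isinstance(cell.value, str) and cell.value.startswith("="):
                    print(cell.coordinate, cell.value)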

  • nickpinkston10m

    I'm now so reliant on ChatGPT for gSheets, that I'd be almost unable to maintain my sheets' absurd formulas without it.

    It's also really accelerated my knowledge and skills around the specifics of the Excel formula language.

    Having an LLM being able to directly read/write at the sheet level, instead of just generating formulas for one cell, would be amazing.

  • blueyes10m

    The real trick would be for LLMs, which currently do math very poorly, to simply send "math to be done" into a spreadsheet, and retrieve the results... (If anyone is aware of an LLM that's great at math and physics, pls lmk!!)
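
    That pattern is usually wired up as tool calling: the model only names the operation and the operands, and deterministic code does the arithmetic. A toy sketch (the call schema and helper names here are hypothetical, not any particular product's API):

        import statistics

        TOOLS = {"sum": sum, "mean": statistics.mean, "max": max}

        def run_tool_call(call: dict, table: dict[str, list[float]]) -> float:
            # `call` stands in for structured LLM output, e.g. {"tool": "sum", "column": "revenue"}
            return TOOLS[call["tool"]](table[call["column"]])

        table = {"revenue": [120.0, 340.5, 99.9]}
        print(run_tool_call({"tool": "sum", "column": "revenue"}, table))  # 560.4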

  • mitjam10m

    Spreadsheets can fill the gap between ad-hoc prompting/prompt workbooks and custom software for special business tasks.

    By using a prompt function like LABS.GENERATIVEAI in Excel you can create solutions that combine calculations, data, and Generative AI. In my experience, transforming data to and from CSV works best for prompting in spreadsheets. Getting data to and from CSV format can be done with other spreadsheet functions.
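
    As a sketch of that CSV round trip (in Python for illustration rather than the Excel function itself; the function names and prompt wording are made up):

        import io
        import pandas as pd

        def build_prompt(df: pd.DataFrame, instruction: str) -> str:
            # Serialize the table to CSV and embed it in the prompt
            return (
                f"{instruction}\n"
                "Input data as CSV:\n"
                f"{df.to_csv(index=False)}\n"
                "Reply with CSV only, keeping the same columns."
            )

        def parse_reply(reply: str) -> pd.DataFrame:
            # Parse the model's CSV reply back into a table for the sheet
            return pd.read_csv(io.StringIO(reply))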

    I've created a book and course (https://mitjamartini.com/resources/ai-engineering/ebooks/han...) that teaches how to do this (both more beginner level). Just working through the examples or the examples provided by Anthropic for Claude for Sheets should be enough to get going.

  • christianqchung10m

    Goodness, I've been making a joke that AI companies are going to spend 500 billion dollars to make spreadsheet generators since 2023, and now it's becoming real. Gemini has a limited form of this too.

  • ilaksh10m

    Is there a github?

    • janpmz10m

      It should become standard for papers to publish their code. After all, it's a publication.

  • jimkoen10m

    @ludicity 's head is going to explode.

  • pavel_lishin10m

    Ah yes, Excel, the piece of software that already famously mangles data, is now going to be glued to software that also famously mangles data.

    Honestly, though, I kind of kid - I love spreadsheets, and if this actually works, it could be interesting. God help whoever needs to troubleshoot the hallucinated results - it's already hard enough to figure out what byzantine knotwork someone created using existing Excel functions, but now we'll have to also guess and second-guess layers of prompts that were used to either generate those same functions, or just generate output that got mulched through some AI black-box.

    • Obscurity434010m

      Spreadsheets have such a way of making chaotic things more clear. I wonder if there's any work on spreadsheets as a multidimensional thinking tool

      • laborcontract10m

        Indeed. I accidentally taught myself linear algebra by spending a ridiculous amount of time in Excel. I only realized that after taking a linear algebra class and feeling helpless... until I mentally remapped the concepts into Excel space, after which it all became easy.

        • mettamage10m

          Could you give an example?

          • laborcontract10m

            Sure. So back in an old finance job I was given a whole bunch of portfolio modeling spreadsheets that were a huge mess of ad hoc columns, which drove me nuts, so everything started with me learning how to use arrays, which significantly reduced the complexity of basic data transformations.

            But then I wanted to analyze all our portfolio data over time, so I had to figure out how to handle multi-dimensionality in my spreadsheets. Then I figured out how to integrate, transform, and reduce portfolio characteristics into sensible components for risk management and portfolio optimization across different asset classes.

            I figured out how to do some absolutely ridiculous stuff in Excel; it's tough for me to think of another tool that comes anywhere close to being as good at helping me work through problems like that.

  • skywhopper10m

    Uhh, this is a paper about how to compress spreadsheet data to fit inside an LLM’s token limits, including such novel approaches as ignoring exact values of numbers, meaning of data types, and any context outside of a detected table of values.
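
    Concretely, the kind of lossy abstraction being described looks roughly like this toy sketch (my own illustration, not the paper's actual encoding): exact numbers become a type tag, and runs of cells sharing a tag collapse into a single range entry.

        from itertools import groupby

        def tag(value) -> str:
            if value is None:
                return "EMPTY"
            if isinstance(value, (int, float)):
                return "NUM"
            return "TEXT"

        def compress_column(values: list) -> list[tuple[str, int]]:
            # Collapse a column of cell values into (tag, run_length) pairs
            return [(t, len(list(run))) for t, run in groupby(values, key=tag)]

        print(compress_column([2021, 2022, 2023, None, "total", 61237.5]))
        # [('NUM', 3), ('EMPTY', 1), ('TEXT', 1), ('NUM', 1)]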

    The paper doesn’t speak at all to actual uses of this approach, but that doesn’t stop the article writer from assuming this is probably a big step towards automated tools that analyze spreadsheet data for non-numerically inclined users.

    This is not that.

  • fimdomeio10m

    Congratulations everyone, we can now automate the next global financial crisis.

    • lainga10m

      Now imagine an LLM trained on LLM web content. We call that an AIslop-squared.

  • surfingdino10m

    I am calmly waiting for the SEC to rip them a new hole the size of Manhattan when hallucinated spreadsheets inevitably make their way into listed companies' reports.

    • rsynnott10m

      That would be on the companies (or their auditors), not Microsoft, in general. Clearly, no-one should ever _use_ this, should it ever make it out of research-land, but there's not that much obvious risk to _making_ it as long as they're honest about the risks.

      • SkyBelow10m

        If it could spit out the analysis as spreadsheets that used standard formulas and only use the LLM to generate the formulas, it could be verified. Errors would slip through, but no worse than people applying the wrong formula based on a quick internet search that calculates a close but incorrect answer.

        • KoolKat2310m

          Agreed. The workflow could also include a person checking it before submitting, just skimming through for errors.

      • michaelmior10m

        > Clearly, no-one should ever _use_ this

        …without validating the results. Otherwise why should we ever use LLMs for anything?

        • surfingdino10m

          That's not the marketing message at the moment. I see ads for AI (LLM) powered services and they all say the same thing: "Stressed? Not enough time? Let AI do it faster so you can do more." AI is sold as a tool that can do things faster than a human can, and since LLMs do not provide reference information, there is no telling where they got the data from and no way to verify it.

          • michaelmior10m

            > LLMs do not provide reference information

            Many do, although it's true that's often not the case.

        • rsynnott10m

          Validating that a complex spreadsheet is correct is notoriously extremely difficult at the best of times; unfortunately they are about the closest thing to a write-only language in common use, and you really have to front load a lot more care than you do in conventional modern languages. The usual safeguards of testing and code review are essentially absent.

          I’m sceptical that anyone really _should_ be using generative AI for anything where correctness matters at all, but spreadsheets in particular seem close to a worst-case scenario.

      • RodgerTheGreat10m

        There may not be much risk from a legal culpability perspective if they make the appropriate disclosures somewhere in the depths of a EULA, but even so it is a failure of professional ethics to build tools which are "dangerous at any speed" and inflict them upon the world.

        • rsynnott10m

          Oh, I don’t disagree, and I don’t think burying it in the EULA would necessarily be sufficient (especially in Europe, where the courts and regulators have tended to take a dim view of “but we told you, in three-point type on page 473 in the middle of the trademark acknowledgements”). But ultimately the blame for using known-unreliable tools is largely on the user.

        • surfingdino10m

          They want to make back the money they invested in OpenAI. We'll be seeing similar "research" (repackaging ChatGPT) from Microsoft for a while.

    • bongodongobob10m

      Well you'll be waiting a very very long time.

  • oslis10m

    [flagged]

  • ffhhj10m

    Waiting for the hallucinate formula:

    =HAL(9000)

  • victor900010m

    Why on earth would you task a non-deterministic technology with data persistence?

    • bubblyworld10m

      The universe is fundamentally a non-deterministic technology, friend. We do what we can =D

  • cyanydeez10m

    DNA researchers had to stop using their preferred gene names because Excel would autocorrect them.

    This will not end well.