I think calling it a "drive" will confuse a lot of people. I expected a device driver, and non-technical people have no idea what a drive is; they just have documents and photos.
I tried with a text file
X Y
1 2
3 4
and for some reason the converted version has 1 2 3 4 in the same row.
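For comparison, a deterministic conversion of that sample keeps one output row per input line. A minimal sketch in Python, assuming whitespace-separated columns; the function name `text_table_to_csv` is made up for illustration and says nothing about how the site actually converts files:

```python
import csv
import io


def text_table_to_csv(text: str) -> str:
    """Convert whitespace-delimited rows to CSV, one output row per input line."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in text.splitlines():
        if line.strip():  # skip blank lines instead of emitting empty rows
            writer.writerow(line.split())
    return out.getvalue()


sample = "X Y\n1 2\n3 4\n"
print(text_table_to_csv(sample))
```

Each input line becomes its own CSV record, so the header and the two data rows stay separate.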
gus_massa 54 days ago [-]
Too late to edit: Google Drive is called "Drive", so I guess it's more common than I noticed.
alentred 52 days ago [-]
Hm, I don't know, I am OK with "drive". Google Drive, Microsoft OneDrive, iCloud Drive.
grandslammer 51 days ago [-]
It really does just make sense, too. We have hard drives, so why wouldn’t we have digital drives?
globular-toast 52 days ago [-]
I like the idea if it was deterministic. So if there are standard ways to convert to/from document types, like Pandoc, being able to write to any one of them and have it update the rest would be interesting.
I hate it if it's built with "AI". Can't imagine a use case for this apart from just shit you don't care about. Why would I be hoarding data I don't care about?
grandslammer 51 days ago [-]
It’s not about hoarding data; it’s about the malleability of the data itself. I’m constantly working with data and need to change how it’s displayed, whether that’s the file type or the way the data is laid out within a specific file, such as a CSV. An application like this lets me quickly reformat files with natural language, for example turning one into an HTML file I can share, or taking that information and converting it back into CSV when I need to load it into a management system or something like that.
jeroenhd 51 days ago [-]
You might not want to use https://anydocai.com/result/<incremental number> for URLs like that. Anyone can enumerate the ~300 files from the home page and look at what others have uploaded.
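One common way to avoid guessable URLs is to replace the incremental number with a random token. A minimal sketch using Python's standard `secrets` module; the function names and the `https://example.invalid` base URL are placeholders for illustration, not the site's real code or routes:

```python
import secrets


def new_result_id(nbytes: int = 16) -> str:
    """Return a random, URL-safe identifier that cannot be guessed by counting."""
    return secrets.token_urlsafe(nbytes)


def result_url(base: str, result_id: str) -> str:
    """Build a result URL from a base address and an opaque identifier."""
    return f"{base}/result/{result_id}"


# Hypothetical base URL, used only for the demo.
print(result_url("https://example.invalid", new_result_id()))
```

Random identifiers only hide the file count and make enumeration impractical. They do not replace per-user authorization checks on the server.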
That said, the website doesn't seem to work anymore. It just errors out.
grandslammer 51 days ago [-]
Also, as far as the enumeration goes: users are only authorized to access the files that they’ve created in our system, but I should definitely obscure the file count.
grandslammer 51 days ago [-]
I’m a mid-level developer, though, so if I messed up the authorization and you worked around it in some way, it would be sick if you let me know: @boshjerns on X
grandslammer 51 days ago [-]
I didn’t expect this to go semi-viral on here, so I just refilled the credits. It actually ran out of OpenAI credits.
woleium 49 days ago [-]
gemini is cheaper, probably
pavel_lishin 51 days ago [-]
I wonder if they ran out of credits.
grandslammer 51 days ago [-]
Yeah, this is exactly what happened. I did not expect this or catch up until just now, but I just fixed it.
Y_Y 51 days ago [-]
I thought this was something like a FUSE driver that would on-the-fly generate any file you tried to read, with some consistency. Like if you open stories/zombie-party.txt it will have some generative network make it, and cache it. If you later ask for stories/zombie-party.odt it can just do a conversion.
I vibe-coded a demo of such a thing, with the idea of making game assets like textures/outdoor/wall.jpg etc. You can do it easily enough, but you need to be patient, and not particularly discerning.
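The generate-once, convert-afterwards behaviour described above can be sketched without any FUSE plumbing. This is a rough illustration in Python, not the demo mentioned in the comment: `LazyStore`, `generate`, and `convert` are hypothetical names, and the two callables stand in for a generative model and a format converter.

```python
from pathlib import PurePosixPath
from typing import Callable, Dict


class LazyStore:
    """Serve any path: reuse a cached sibling if one exists, else generate."""

    def __init__(
        self,
        generate: Callable[[str], bytes],
        convert: Callable[[bytes, str, str], bytes],
    ) -> None:
        self._generate = generate  # stand-in for a generative model
        self._convert = convert    # stand-in for a format converter
        self._cache: Dict[str, bytes] = {}

    def read(self, path: str) -> bytes:
        if path in self._cache:
            return self._cache[path]
        wanted = PurePosixPath(path)
        # Look for a cached file with the same directory and stem but
        # another extension, so repeated requests stay consistent.
        for cached_path, data in self._cache.items():
            cached = PurePosixPath(cached_path)
            if cached.parent == wanted.parent and cached.stem == wanted.stem:
                result = self._convert(data, cached.suffix, wanted.suffix)
                break
        else:
            result = self._generate(path)
        self._cache[path] = result
        return result


# Demo with trivial stand-ins for the model and the converter.
store = LazyStore(
    generate=lambda path: f"generated for {path}".encode(),
    convert=lambda data, src, dst: data + f" [{src} -> {dst}]".encode(),
)
print(store.read("stories/zombie-party.txt"))
print(store.read("stories/zombie-party.odt"))
```

A real filesystem would call `read` from its open/read handlers; the cache is what gives the "some consistency" property, since a second format is derived from the first result instead of being generated from scratch.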
raphman 51 days ago [-]
FWIW, I wrote a small paper on this general topic a few years ago, collecting earlier work and my own ideas.
"Files as Directories: Some Thoughts on Accessing Structured Data within Files"
Thanks! Sorry - I didn't realize that the paper in the ACM DL is not open-access.
quesera 51 days ago [-]
I played with this idea for media servers.
I want iTunes and Audiobookshelf and beets and Jellyfin, etc to all work on the same filesystem and media archive.
There are challenges.
RileyJames 52 days ago [-]
It’s an interesting idea.
I’ve definitely felt the pain of file formats in some unexpected ways recently.
Like airdropping a photo from iPhone only to discover a .HEIC file, which nothing will accept.
I’ve previously used “whatever turns up first on Google”, but I now won’t for anything of significance (privacy).
I’ve recently discovered Automator (on Mac) and the quick actions menu, which can handle a lot of image- and PDF-related conversions but takes some setup (not a mass-market solution).
I like the idea of this product. But I think the challenge will be:
- reaching the user at the moment they have this problem
- making your solution frictionless to solve their immediate problem, while also bootstrapping to solve it next time around (without them forgetting it exists)
If you can nail that experience for a single use case, I think this will be a winner.
grandslammer 51 days ago [-]
Hey, I’m just catching up here. I really appreciate the feedback, and I’m going to work on integrating all of it into the application and repost about it again. I really appreciate you.
DontchaKnowit 51 days ago [-]
I think the real problem is getting it to actually work....
grandslammer 51 days ago [-]
i think i hit credit limits because so many people were using the app all of a sudden and i'm just like using my own funds for api costs and had a cap on my openai account
grandslammer 51 days ago [-]
Let me actually work on this HEIC issue right now. I think that I know a fix for this.
unsnap_biceps 52 days ago [-]
Have you considered writing this as a FUSE system rather than a web service?
TomMasz 51 days ago [-]
I got "No video with supported format and MIME type found." in the How It Works section.
grandslammer 51 days ago [-]
Video files aren't supported right now, which is probably the issue. I’m working on it: I’ll have to parse the video into frames and then feed the frames into a model, and I just need to work this out a little more.
troyvit 51 days ago [-]
I think you can get past that if you download the video, then upload it back up to anydoc and ask it to translate it to Markdown.
edit: /s
grandslammer 50 days ago [-]
hahahaahhaha nice one
sigmaisaletter 52 days ago [-]
It's a fancy file format conversion utility.
Am I missing something?
dsr_ 51 days ago [-]
Yes: it's a fancy file format conversion service that adds errors so your QA people have more work.
voidUpdate 52 days ago [-]
So I can have my exe convert into a shapefile and an mp3?
voidUpdate 52 days ago [-]
Well I tried to convert an exe into a pptx and it outputted a file that looked like an attempt at html, saying that the conversion wasn't feasible due to its nature and size
lloeki 51 days ago [-]
You have it the wrong way around? Usually you are handed a pptx by customers and your job is to turn that into an exe.
voidUpdate 51 days ago [-]
I mean the website says it can convert anything to anything ("every file exists as all file types all of the time") so it should be able to do exe to pptx
thebeardisred 51 days ago [-]
You missed the joke
jeroenhd 51 days ago [-]
Based on important research like https://www.youtube.com/watch?v=uNjxe8ShM-8, it should definitely be possible to generate a .pptx that will run Windows inside of an emulator inside PowerPoint slides. That HTML file is lying to you!
grandslammer 51 days ago [-]
i need to work on the pptx conversions and some of the other file types specifically- now i'm on this
grandslammer 51 days ago [-]
I’m actually working on some really interesting conversions right now
ramoz 52 days ago [-]
AI as a use case doesn’t make sense to me.
You’re using AI to create a transpilation of whatever modality. It’s a wasted step if the purpose is to feed back into AI.
grandslammer 51 days ago [-]
I literally find myself using this tool every day, because I need to use natural language to reformat files and the data in them, like CSVs or Markdown. So maybe this isn’t useful for you, but it definitely is useful to have the LLM interpret your natural language and redesign the file so it presents the information the way you want.
ramoz 50 days ago [-]
You're talking about a commodity interaction. At this point your tool offers nothing different from a chatbot other than your confusing semantics and abstraction.
What I'm saying: if the point is to "convert this CSV to Markdown so I can feed the Markdown to an LLM to ask questions about it", etc., it is a completely unnecessary step.
Your service is nothing more than:
1. augmented metadata for files; btw if that requires a whole new drive-oriented solution then you're doing too much.
2. llm api wrapper for a commoditized capability (custom format/or transpilation)
grandslammer 50 days ago [-]
The friction isn’t “can I call an LLM,” it’s every time I want to do anything with this file I have to:
open it in a tool that understands the format,
export / paste the part I care about,
phrase an LLM prompt,
paste the result back,
do this all again if i want the data formatted differently for different use cases.
adding the ability to format your data and view/download that natively, fast is like giving python scripting capabilities to normal users. You're thinking like a dev not like a business owner who may want to take a picture of a timesheet and have that immediately become a CSV then have it reformatted for a management system they use, all on the fly through natural language... there's so many ways that normal people navigate files and formats and I want to give these people some superpowers that they won't seek out themselves.
the gpt-wrapper argument is so played out. just like you’d say “my app is a GPT-wrapper” (it wraps the OpenAI API in a file-centric UX), you could say “Google Drive is a distributed-storage-wrapper” or “a cloud-storage-and-sync wrapper.” It’s the polished frontend and glue that makes the raw backend useful to end users.
cyanydeez 51 days ago [-]
keep in mind, almost all the uses of the current AI are to generate some unstable product that whimsically can change given a butterfly's wings in Japan.
ramses0 51 days ago [-]
Back in the day there were a bunch of `x2y` programs[1], like html2pdf, xls2csv, rst2odt, jpg2png, png2jpg, etc...
You could imagine something like `any2zip`, or `any2tgz` or `iso2mp4` or something.
It seems like there could/should be some sort of virtual filesystem where you could say "cat inventory.xls.csv", or "wine.exe excel.exe inventory.csv.xls" (please bear with me on these examples). Effectively "$BLOB.format.format", where "." becomes a sort of "convert to this $TYPE".
...if you requested `README.md.pdf`, maybe it could intuit the intermediate `md2html2pdf` (HTML) portion?
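Finding the intermediate step for a name like `README.md.pdf` is a shortest-path search over whatever converters are installed. A minimal sketch in Python, assuming a made-up `CONVERTERS` table (each key can be converted directly to the formats in its list); the table and both function names are illustrative only:

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

# Hypothetical converter table: source format -> directly reachable formats.
CONVERTERS: Dict[str, List[str]] = {
    "md": ["html"],
    "html": ["pdf"],
    "xls": ["csv"],
    "csv": ["xls"],
}


def parse_virtual_name(name: str) -> Tuple[str, str, str]:
    """Split 'README.md.pdf' into (real file, source format, target format)."""
    base, _, target = name.rpartition(".")
    _, _, source = base.rpartition(".")
    return base, source, target


def find_chain(src: str, dst: str) -> Optional[List[str]]:
    """Breadth-first search for the shortest chain of formats from src to dst."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in CONVERTERS.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no chain of installed converters reaches dst


base, source, target = parse_virtual_name("README.md.pdf")
print(base, find_chain(source, target))
```

With this table the request resolves to the real file `README.md` and the chain md, html, pdf. A request with no route (for example pdf back to md here) returns `None`, which a filesystem layer could report as "file not found".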
I really wish local linux filesystems (for end-users) would at least match Apple's capabilities. eg: `$RECENT`, spotlight, auto-OCR. We've really regressed since the era of `locate`, but I'd _LOVE_ some sort of modern equivalent.
Imagine: `inotify`, `auditd`, just anything that can avoid full-disk scans during "normal end user" daily operation... wired up to `llm-summary $FILE >> sqlite.db ; llava-describe $IMAGE >> sqlite.db ; etc...`
For bonus points, catch anything missed with some sort of full daily/weekly backup operation. We're on the cusp of a much more intimate "partnership" with the compute boxes underneath our desks, but so much is getting sucked into the void of "the network is the computer".
> Back in the day there were a bunch of `x2y` programs[1], like html2pdf, xls2csv, rst2odt, jpg2png, png2jpg, etc...
They're still around. A problem is loss of information on each conversion. For example, wav->mp3 loses info. Converting back (mp3->wav) won't get you the exact .wav you started with. Similar thing with file types supporting different resolution graphics, vector vs. bitmap, metadata being stripped, features in format A not supported in format B, etc.
Another problem is the explosion of M:N file format combinations. A possible fix would be a universal (?) in-between format, functioning as a container for [portions of a file] + whatever metadata was extracted from original. That way you can at least do conversions along the lines of video container formats, where container type is changed but video inside does not get decompressed/re-encoded. Or similar operations like extracting/shuffling pages in a pdf document.
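The size of that explosion is easy to put numbers on. A small Python sketch, assuming every format should be convertible to every other one and that a hub format needs one importer and one exporter per format (both function names are made up for this example):

```python
def direct_converters(n_formats: int) -> int:
    """One-way converters needed if every ordered pair gets its own tool."""
    return n_formats * (n_formats - 1)


def hub_converters(n_formats: int) -> int:
    """One-way converters needed if everything goes through one hub format."""
    return 2 * n_formats


for n in (5, 10, 50):
    print(n, direct_converters(n), hub_converters(n))
```

For 10 formats that is 90 pairwise converters versus 20 through a hub. The count says nothing about fidelity: the hub only helps if it can carry everything the source format expresses, which is the information-loss problem described above.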
All in all this is not an easy problem & therefore unlikely to be solved anytime soon.
grandslammer 51 days ago [-]
really appreciate you adding to the discourse here - I'm not sure if you got a chance to test out the site but I refilled my credits after the surge of attention and would love if you checked it out! also @boshjerns on X if you want to reach out to chat
nkrisc 52 days ago [-]
So it turns one file into many? Or is it actually one file that is simultaneously a valid HTML document and PNG?
grandslammer 51 days ago [-]
It’s basically an access layer that gives you quick access to all the different conversions of the files in one place, but it also allows you to redesign them with natural language so that you can configure them for your needs on the fly
IAmBroom 52 days ago [-]
According to what I read, the latter.
c0wb0yc0d3r 52 days ago [-]
And you’d be misled. The video shows the original file is converted to different formats, depending on the user’s selection. The video shows jpeg to html (using AI to perform OCR?).
Pandoc but extra AI steps.
grandslammer 50 days ago [-]
That argument really skips over what most people actually need. Nobody outside of a tech bubble wants to learn half a dozen Pandoc flags, stitch together shell commands and temp files, or write Lua filters just to reshape a document. With our drive layer you literally rename a file or type “make this header bold and export as PDF” and the work just happens, no scripts required.
This isn’t about replacing power-user workflows, it’s about giving anyone on your team the ability to reshape data and documents without ever opening a terminal. You get flexibility with the simple UX of renaming a file. Calling it “Pandoc plus AI” misses the fact that 90 percent of users neither know nor care about Pandoc’s internals. They just want “I have a file, make it look like this, or formatted with these sections to share with X person who works in X field...” and that’s exactly what our natural-language, filesystem-driven approach delivers.
lawlessone 51 days ago [-]
Wouldn't this make every file a lot bigger?
grandslammer 51 days ago [-]
It’s not really saving everything into one file; rather, it’s a layer that gives you access to all the file types easily and fast.
SPBS 52 days ago [-]
the page is really laggy on Edge, kills any interest in wanting to explore more (strangely, it's much snappier on Chrome)
grandslammer 51 days ago [-]
I will work to fix this ASAP; I just caught up here.
emadda 52 days ago [-]
Could have called it quantumdoc
grandslammer 51 days ago [-]
It’s not too late
_wire_ 51 days ago [-]
Imagine no file types
♪ It's easy if you try
No hell below us
Above us only sky
Imagine all the people
Visualize whirled peas
Ah ah ah oooo!
You may say I'm a dreamer...
grandslammer 51 days ago [-]
real one
jy14898 51 days ago [-]
Now make it an HTTP API where content negotiation always succeeds
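As a rough sketch of what "content negotiation always succeeds" could mean: read the `Accept` header, take the client's top preference, and convert to it instead of answering 406. The Python below is a simplified illustration, not a full implementation of HTTP's rules; `parse_accept` and `negotiate` are made-up names, and partial ranges such as `image/*` are simply skipped.

```python
from typing import List, Tuple


def parse_accept(header: str) -> List[str]:
    """Return media types from an Accept header, highest q-value first."""
    ranked = []
    for index, part in enumerate(header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        q = 1.0  # default weight when no q parameter is given
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0
        if q > 0:  # q=0 means "not acceptable", so drop it
            ranked.append((-q, index, media_type))
    return [media_type for _, _, media_type in sorted(ranked)]


def negotiate(accept: str, stored_type: str) -> Tuple[str, bool]:
    """Pick a response type; the flag says whether a conversion is needed."""
    for media_type in parse_accept(accept):
        if media_type in ("*/*", stored_type):
            return stored_type, False
        if not media_type.endswith("/*"):
            return media_type, True  # never refuse: convert on demand
    return stored_type, False  # empty or unusable header: serve as stored


print(negotiate("application/pdf, text/html;q=0.8", "text/markdown"))
```

The converter itself is left out; the point is only that the server never has to reject a request when any format can be produced from any other.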
55 days ago [-]
dankobgd 52 days ago [-]
[flagged]
rpgraham84 52 days ago [-]
[flagged]
dgan 52 days ago [-]
I've read the title 5 times, and can't make sense of it. Is this even valid English?
Akronymus 51 days ago [-]
>Imagine a drive[,] where every file exists[,] as all file types[,] all of the time
Basically treating one file type as if it were any arbitrary other file type
quesera 51 days ago [-]
Punctuated like that, I can't help reading it in the movie trailer guy[0] voice.
https://dl.acm.org/doi/pdf/10.1145/3191697.3214323
Is this paper freely available somewhere?
Link found here: https://scholar.google.com/scholar?cluster=14832107127874645...
[1]: compgen -c | grep 2 | grep -v '2$' | grep -v '\.2' | grep -v '2\.'
[2]: https://en.wikipedia.org/wiki/Locate_(Unix)
[0] https://en.wikipedia.org/wiki/Don_LaFontaine .. wow, dead for 17 years!