
Convert Any Book to a DIY Audiobook?

6 July 2025 at 08:00

If the idea of reading a physical book sounds like hard work, [Nick Bild’s] latest project, the PageParrot, might be for you. While AI gets a lot of flak these days, one thing modern multimodal models do exceptionally well is image interpretation, and PageParrot demonstrates just how accessible that’s become.

[Nick] demonstrates quite clearly how little code is needed to get from those cryptic black and white glyphs to sounds the average human can understand, specifically a paltry 80 lines of Python. Admittedly, many of those lines are pulling in libraries, and some are just blank, so functionally speaking, it’s even shorter than that. Of course, the whole application is mostly glue code, stitching together other people’s hard work, but it’s still instructive and fun to play with.

The hardware required is a Raspberry Pi Zero 2 W, a camera (in this case, a USB webcam), and something to hold it above the book. Any Pi with the ability to connect to a camera should also work, however, with just a little configuration.

On the software side, [Nick] pulls in the CV2 library (the Python interface to OpenCV) to handle the camera, setting it to full HD resolution. Google’s GenAI library is used to talk to the Gemini 2.5 Flash LLM via an API endpoint. This takes a captured image and a trivial prompt, and returns the whole page of text, quick as a flash.
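As a rough illustration of the capture-and-transcribe step described above, here is a minimal sketch. It is not [Nick]'s script: the prompt wording, the `GEMINI_API_KEY` environment variable, and the function names `capture_page` and `page_to_text` are all assumptions made for this example. It assumes the `opencv-python` and `google-genai` packages are installed.

```python
import os

FULL_HD = (1920, 1080)  # width, height in pixels
MODEL = "gemini-2.5-flash"
# Assumed prompt wording; the original script's prompt may differ.
PROMPT = "Read this book page and return all of its text, and nothing else."


def capture_page(camera_index=0):
    """Grab one frame from a webcam and return it as JPEG bytes."""
    import cv2  # imported lazily so the module loads without OpenCV present

    cap = cv2.VideoCapture(camera_index)
    try:
        # Request full HD; the driver may fall back to another resolution.
        cap.set(cv2.CAP_PROP_FRAME_WIDTH, FULL_HD[0])
        cap.set(cv2.CAP_PROP_FRAME_HEIGHT, FULL_HD[1])
        ok, frame = cap.read()
        if not ok:
            raise RuntimeError("could not read a frame from the camera")
        ok, jpeg = cv2.imencode(".jpg", frame)
        if not ok:
            raise RuntimeError("could not encode the frame as JPEG")
        return jpeg.tobytes()
    finally:
        cap.release()


def page_to_text(jpeg_bytes, prompt=PROMPT):
    """Send the image and prompt to Gemini and return the text it reads."""
    from google import genai
    from google.genai import types

    # Hypothetical choice: read the API key from an environment variable.
    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    response = client.models.generate_content(
        model=MODEL,
        contents=[
            types.Part.from_bytes(data=jpeg_bytes, mime_type="image/jpeg"),
            prompt,
        ],
    )
    return response.text


if __name__ == "__main__":
    print(page_to_text(capture_page()))
```

Keeping the capture and the API call in separate functions makes it easy to test the transcription on a saved JPEG before pointing a camera at a book.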

Finally, the script hands that text over to Piper, which turns that into a speech file in WAV format. This can then be played to an audio device with a call out to the console aplay tool. It’s all very simple at this level of abstraction.
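The text-to-speech hand-off can be sketched in a few lines as well. This is an assumption-laden example, not the original code: it assumes the `piper` command-line tool is on the PATH, reads text on standard input, and accepts `--model` and `--output_file`, and that `aplay` is available. The voice model filename and `page.wav` are placeholders.

```python
import subprocess


def piper_command(model_path, wav_path):
    """Build the argv for Piper, which reads the text to speak from stdin."""
    return ["piper", "--model", model_path, "--output_file", wav_path]


def aplay_command(wav_path):
    """Build the argv for ALSA's aplay to play a WAV file."""
    return ["aplay", wav_path]


def speak(text, model_path="en_US-lessac-medium.onnx", wav_path="page.wav"):
    """Synthesize text to a WAV file with Piper, then play it with aplay."""
    # The model filename above is a placeholder for whichever voice is installed.
    subprocess.run(
        piper_command(model_path, wav_path),
        input=text.encode("utf-8"),
        check=True,
    )
    subprocess.run(aplay_command(wav_path), check=True)


if __name__ == "__main__":
    speak("Hello from a book page.")
```

Passing the commands as argument lists rather than shell strings avoids quoting problems when the page text contains apostrophes or other shell metacharacters.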

Yes, we know it’s essentially just doing the same thing OCR software has been doing for decades. Still, the AI version is remarkably low-effort and surprisingly accurate, especially when handling unusual layouts that confound traditional OCR algorithms. Extensions to this tool would be trivial; for example, adjusting the prompt to ask it to translate the text to a different language could open up a whole new world to some people.
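To show how small the translation tweak mentioned above could be, here is a hypothetical prompt builder. The base prompt wording and the `build_prompt` name are invented for this sketch and are not taken from the PageParrot script.

```python
# Assumed base prompt; the real script's wording may differ.
BASE_PROMPT = "Read this book page and return all of its text, and nothing else."


def build_prompt(target_language=None):
    """Return the transcription prompt, optionally asking for a translation."""
    if target_language is None:
        return BASE_PROMPT
    return f"{BASE_PROMPT} Translate the text into {target_language}."


if __name__ == "__main__":
    print(build_prompt())
    print(build_prompt("Spanish"))
```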

If you want to play along at home, then head on over to the PageParrot GitHub page and download the script.

If this setup feels familiar, you’d be quite correct. We covered something similar a couple of years back, which used Tesseract OCR, feeding text to Festvox’s CMU Flite tool. Whilst we’re talking about text-to-speech, here’s a fun ESP32-based software phoneme synthesiser to recreate that distinctive 1980s Speak & Spell voice.

AI Might Kill Us All (With Carbon Emissions)

4 July 2025 at 02:00

So-called artificial intelligence (AI) is all the rage right now, what with your grandma asking ChatGPT how to code in Python and influencers making videos without having to hire extras, but one growing concern is where the power is going to come from for the data centers. The MIT Technology Review team did a deep dive on what the current situation is and whether AI is going to kill us all (with carbon emissions).

Probably of most interest to you, dear hacker, is how they came up with their numbers. With no agreed-upon methods and different companies doing different types of processing, there were a number of assumptions baked into their estimates. Given the lack of information for closed-source models, open-source models were used as the benchmark for energy usage and extrapolated to the industry as a whole. Unsurprisingly, larger models have a larger energy usage footprint.

While data center power usage remained roughly the same from 2005 to 2017 as increases in efficiency offset the increase in online services, data centers doubled their energy consumption by 2023 from those earlier numbers. The power running into those data centers is 48% more carbon intensive than the US average already, and expected to rise as new data centers push for increased fossil fuel usage, like Meta in Louisiana or the X data center found to be using methane generators in violation of the Clean Air Act.

Technology Review did find “researchers estimate that if data centers cut their electricity use by roughly half for just a few hours during the year, it will allow utilities to handle some additional 76 gigawatts of new demand.” This would mean either reallocating requests to servers in other geographic regions or just slowing down responses for the 80-90 hours a year when the grid is at its highest loads.
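For a sense of scale, a quick back-of-the-envelope calculation shows how small a slice of the year those 80-90 hours are. The only assumption is a non-leap year of 8,760 hours.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours in a non-leap year


def share_of_year(hours):
    """Return the fraction of a year that the given number of hours covers."""
    return hours / HOURS_PER_YEAR


if __name__ == "__main__":
    for hours in (80, 90):
        print(f"{hours} h is {share_of_year(hours):.2%} of the year")
    # prints:
    # 80 h is 0.91% of the year
    # 90 h is 1.03% of the year
```

In other words, the proposed curtailment would apply for roughly one percent of the year.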

If you’re interested in just where a lot of the US-based data centers are, check out this map from NREL. Still not sure how these LLMs even work? Here’s an explainer for you.

Hackaday Links: June 29, 2025

29 June 2025 at 23:00

In today’s episode of “AI Is Why We Can’t Have Nice Things,” we feature the Hertz Corporation and its new AI-powered rental car damage scanners. Gone are the days when an overworked human in a snappy windbreaker would give your rental return a once-over with the old Mark Ones to make sure you hadn’t messed the car up too badly. Instead, Hertz is fielding up to 100 of these “MRI scanners for cars.” The “damage discovery tool” uses cameras to capture images of the car and compares them to a model that’s apparently been trained on nothing but showroom cars. Redditors who’ve had the displeasure of being subjected to this thing report being charged egregiously high damage fees for non-existent damage. To add insult to injury, if renters want to appeal those charges, they have to argue with a chatbot first, one that offers no path to speaking with a human. While this is likely to be quite a tidy profit center for Hertz, their customers still have a vote here, and backlash will likely lead the company to adjust the model to be a bit more lenient, if not outright scrapping the system.

Have you ever picked up a flashlight and tried to shine it through your hand? You probably have; it’s just a thing you do, like the “double tap” every time you pick up a power drill. We’ve yet to find a flashlight bright enough to sufficiently outline the bones in our palm, although we’ve had some luck looking through the flesh of our fingers. While that’s pretty cool, it’s quite a bit different from shining a light directly through a human head, which was recently accomplished for the first time at the University of Glasgow. The researchers blasted a powerful pulsed laser against the skull of a volunteer with “fair skin and no hair” and managed to pick up a few photons on the other side, despite an attenuation factor of about 10^18. We haven’t read the paper yet, so it’s unclear if the researchers controlled for the possibility of the flesh on the volunteer’s skull acting like a light pipe and conducting the light around the skull rather than through it, but if the laser did indeed penetrate the skull and everything within it, it’s pretty cool. Why would you do this, especially when we already have powerful light sources that can easily penetrate the skull and create exquisitely detailed images of the internal structures? Why the hell wouldn’t you?!

TIG welding aluminum is a tough process to master, and just getting to the point where you’ve got a weld you’re not too embarrassed of would be so much easier if you could just watch someone who knows what they’re doing. That’s a tall order, though, as the work area is literally a tiny pool of molten metal no more than a centimeter in diameter that’s bathed in an ultra-bright arc that’s throwing off cornea-destroying UV light. Luckily, Aaron over at 6061.com on YouTube has a fantastic new video featuring up-close and personal shots of him welding up some aluminum coupons. He captured them with a Helios high-speed welding camera, and the detail is fantastic. You can watch the weld pool forming and see the cleaning action of the AC waveform clearly. The shots make it clear exactly where and when you should dip your filler rod into the pool, the effect of moving the torch smoothly and evenly, and how contaminants can find their way into your welds. The shots make it clear what a dynamic environment the weld pool is, and why it’s so hard to control.

And finally, the title may be provocative, but “The Sensual Wrench” is a must-see video for anyone even remotely interested in tools. It’s from the New Mind channel on YouTube, and it covers the complete history of wrenches. Our biggest surprise was learning how relatively recent an invention the wrench is; it didn’t really make an appearance in anything like its modern form until the 1800s. The video covers everything from the first adjustable wrenches, including the classic “monkey” and “Crescent” patterns, through socket wrenches with all their various elaborations, right through to impact wrenches. Check it out and get your ugga-dugga on.

Flopped Humane “AI Pin” Gets an Experimental SDK

19 June 2025 at 11:00

The Humane AI Pin was ambitious and expensive, and it failed to captivate people in the short time between its launch and its shutdown. While the units do contain some interesting elements like the embedded projector, it’s all locked down tight, and the cloud services that tie it all together no longer exist. The devices technically still work; they just can’t do much of anything.

The Humane AI Pin had some bold ideas, like an embedded projector. (Image credit: Humane)

Since then, developers like [Adam Gastineau] have been hard at work turning the device into an experimental development platform: PenumbraOS, which provides a means to allow “untrusted” applications to perform privileged operations.

As announced earlier this month on social media, the experimental SDK lets developers treat the pin as a mostly normal Android device, with the addition of a modular, user-facing assistant app called MABL. [Adam] stresses that this is all highly experimental and has a way to go before it is useful in a user-facing sort of way, but there is absolutely a workable architecture.

When the Humane AI Pin launched, it aimed to compete with smartphones but failed to impress much of anyone. As a result, things folded in record time. Humane’s founders took jobs at HP and buyers were left with expensive paperweights due to the highly restrictive design.

Thankfully, a load of reverse engineering has laid the path to getting some new life out of these ambitious devices. The project could sure use help from anyone willing to pitch in, so if that’s up your alley be sure to join the project; you’ll be in good company.

Butternut AI

By: EasyWithAI
3 April 2023 at 15:38
Butternut AI helps you create a complete, fully functional website in seconds without any coding required. You can customize your website to suit your brand with ease and get automatic SEO optimization to rank on top of Google search. Butternut AI’s intuitive platform allows anyone to become a website developer; just enter your business name […]

Source

Ellie

By: EasyWithAI
22 December 2022 at 13:55
Ellie is an AI email assistant that helps you craft replies in your own writing style. The AI algorithm takes context from your previous email threads and is able to understand and respond in any language. It’s currently available as a Chrome or Firefox extension with Gmail support, but it plans to support other web-based […]

Source

Stable Artisan

By: EasyWithAI
23 May 2024 at 11:26
Stable Artisan brings the power of Stability AI’s generative models like Stable Diffusion 3.0 and Stable Video Diffusion together. Both models are now available to access on the official Stable Diffusion Discord server and can be interacted with using commands and prompts, much like Midjourney. Stable Artisan also offers a suite of editing tools like […]

Source

Contlo.ai

By: EasyWithAI
1 February 2023 at 15:05
Contlo.ai is an all-in-one AI marketing platform. With a conversational UI, you can manage all your marketing needs through a single chat interface. The tool offers end-to-end campaign management, plain English customer segmentation, predictive analytics, social media management, and SEO-optimized content creation.

Source

Reminisce.ai

By: EasyWithAI
25 August 2023 at 12:17
Reminisce.ai is an AI-powered online learning platform that makes it easy and fun to build technology skills and career paths. It uses cheat sheets, quizzes, and games to help you learn IT skills like Kubernetes, React, and AWS. With personalized career coaching, you can develop the right skills for roles like AI Engineer, Blockchain Developer, […]

Source

Exploring Generative AI in Photoshop

By: EasyWithAI
19 September 2023 at 14:45
Category – Adobe Photoshop, Generative AI
Course Difficulty – Easy
Course Length – 27 Minutes
Price – Requires Skillshare Subscription
Rating – 4/5

This comprehensive course is designed for beginners who want to discover the incredible capabilities of generative AI within Adobe Photoshop. You’ll learn to harness the power of generative fill functions […]

Source

LongShot

By: EasyWithAI
10 December 2022 at 22:24
LongShot is a comprehensive tool designed not only for generating high-quality, factually accurate content but also for optimizing it using advanced features. This platform stands out by incorporating real-time information into content creation, ensuring relevance and accuracy. Key features include Semantic SEO, fact-checking with citations, AI Interlinking, Humanizing AI, and Plagiarism Checker. Furthermore, LongShot […]

Source

Artchan AI

By: EasyWithAI
24 July 2023 at 04:57
Artchan is a new AI-powered image generator that makes creating art simple and accessible to everyone. Artchan specializes in creating anime and fantasy artwork with simple prompts and high-quality results. Artchan also has a rapidly growing community of artists sharing their work. You can use their work as inspiration and clone their prompt to […]

Source

Second Nature AI

By: EasyWithAI
1 March 2023 at 16:05
Second Nature offers AI-based conversational sales training software that is designed to improve your marketing and sales skills. The platform lets you practice any type of sales conversation to help train your or your team’s communication and marketing efforts. Second Nature AI provides a “virtual pitch partner” that uses conversational AI to have actual […]

Source

Octane AI

By: EasyWithAI
6 January 2023 at 18:45
Octane AI is an ecommerce tool tailored towards Shopify store owners for improving their sales and marketing efforts through the use of quizzes, surveys, and product recommendations. It allows companies to gather feedback, find the right products for their customers, and increase revenue through personalized experiences. Jones Road, one of the fastest growing Shopify brands, […]

Source

DreamHouse AI

By: EasyWithAI
8 February 2023 at 15:20
DreamHouse AI is an interior design app that uses AI to generate virtual interior designs. You can upload a photo of your room, and the app will generate professionally designed interiors in minutes. You can experiment with different perspectives and angles to get the best results. The Inspiration mode allows you to get creative interior […]

Source

Diffusion Art

By: EasyWithAI
31 March 2023 at 14:35
Diffusion Art is a free web-based art generator. Unlike MidJourney, there’s no need for Discord and no login required. It’s also completely anonymous, keeping your generated art private and not shared with a Discord server! This AI art generator also features a built-in advanced prompt generator and tuner. Diffusion Art also comes with a variety […]

Source

Cleanvoice

By: EasyWithAI
19 December 2022 at 17:05
Cleanvoice is an AI voice tool that improves the quality of your audio recordings by removing filler sounds, stuttering, and mouth noises. It is capable of detecting and removing these issues in multiple languages, including those with heavy accents. Cleanvoice can also identify and remove long periods of silence (dead air) in order to keep […]

Source
