Failed to get the article :( #19
Comments
Additionally, the link to the "Original source (on modern site)" is broken. The "Original source" URL has the same wrapper as the front page link, which sends it back to article.php (which tries to send it through "readability" again), and, of course, it fails again. As a work-around you can edit it in your browser's URL bar to remove the wrapper (which looks like this: "http://68k.news/article.php?loc=US&a=") and reload. I've been doing this for a few weeks now (it's getting a little old). Now that I've found the code, I'm gonna try to create a fix. |
I have been running into this too for a while. |
Makes me wonder if News Sites are against this because they love their freakin' DRM/anti-adblock. I have a feeling they are blocking 68k.news and FrogFind on purpose. |
I'm having the same issue :( |
Found the issue. |
The best route is to bypass Google News by manually decoding the URL. |
I've tried looking into this a few times. It seems Google has made changes over time that break the current implementation, with no real clean way to implement a fix that I can see. In short;
There are some existing code snippets out there (e.g. this one) that are successfully able to get the decoded URL, but as this is dependent on an internal-only "API" and seems to be rather actively rate limited too, I fear that attempting to implement this is at best just going to give us a very short-lived success story. Is there perhaps any other source of news that can be considered that has an API or at least a URL structure that isn't changing as actively as Google News? It just seems like Google is very actively attempting to prevent outside parties from scraping or otherwise using them as a source. |
Not to contradict what anyone has said, and admittedly I don't know the code very well, but I want to point out that 68k IS getting the correct URL from GN. When we get the error page (article.php) it has an embedded link to the "Original page", and both that link and the one currently in the URL bar at that time contain the correct URL, but it is prefixed with "http://68k.news/article.php?loc=US&a=". I have been systematically editing the URL to remove that prefix and then the original page loads perfectly (it's a pain but it works and it's the only way 68k is currently usable to me). I looked at article.php and it is sending that same URL (ie the same variable) both to the "reader mode" module and the "Original page" link. I have flirted with the idea of writing a hack that removes the prefix (if present), and I think that would make things work again. However, I want to research the source of the prefix+URL before doing anything and just haven't found the time. |
I should add that what I just said suggests the issue is a lot simpler than some are making it out to be (i.e. it's not google; the added prefix clearly comes from 68k). |
What you're referring to here is the Google News link, not the original URL. This, as long as you open it in a browser with JavaScript enabled, redirects to the real article URL. It's that last link that 68k news needs as it tries to parse the original article and show it in a text-only way for old machines/browsers, just like its main list view.
What you're describing here is simply visiting that original page directly, which on a modern (enough) browser will work fine of course. But for those that visit 68k news on vintage hardware or simply want a text-only experience, this won't do what they're looking for.
The prefix, as you call it, is a necessary part of how 68k news works, as it (originally, before Google News changed things up) would render a text-only version of the original article contents. Without this people on vintage hardware/browsers can only reliably view the 68k homepage that lists out links to articles, but not be able to actually read each article — unless their browser does support whatever tech stack the particular website in question uses, of course. Now that Google has changed things in a way that there is no obvious solution (yet) on how to make it work as it used to again, it sadly just shows a failed message. I think a useful addition here would be to have the link to the Google News article URL right there alongside the error message, but the bigger issue of course is that we'd like 68k news to go back to being able to actually fetch, parse, and render these articles again. |
I think it is possible to reverse engineer the "internal" API. Either that or we have to move away from Google News. |
Now, I figured out a possible alternative: fresh rss |
or tiny tiny rss. |
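For context on why an RSS-based source avoids the redirect problem: in RSS 2.0 each item carries the article URL in a plain `<link>` element, so no JavaScript redirect or URL decoding is needed. Below is a minimal sketch using only the Python standard library. The feed content and the `list_items` helper are made up for illustration; they are not taken from 68k.news, FreshRSS, or Tiny Tiny RSS.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real one would be fetched from a publisher
# or from an aggregator such as the ones suggested above.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First story</title>
      <link>https://example.com/first</link>
    </item>
    <item>
      <title>Second story</title>
      <link>https://example.com/second</link>
    </item>
  </channel>
</rss>"""


def list_items(feed_xml):
    """Return (title, link) pairs for every <item> that has a link."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        if link:  # skip items without a usable article URL
            items.append((title, link))
    return items


for title, link in list_items(SAMPLE_FEED):
    print(title, "->", link)
```

The links come out as direct article URLs, which is the form the text-only article renderer needs.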
You're right. Sorry for the confusion. I read the source too hastily. It would be helpful to remove the article.php wrapper from the "Original source" link, at least in the interim. It won't help folks on limited devices, but it will help those of us who use 68k just to avoid the bloat that Google adds. Incidentally, FWIW, I tried loading a Google News link in dev tools and noticed a couple of messages referring to ad insertion. I hadn't noticed inserted ads before, but I run uBlock Origin. I retested with it disabled and a giant banner (possibly video, I don't recall) appeared at the top of the page. Anyway, from this, I can speculate that one of the reasons (maybe the main one?) behind their recent changes is to protect their ad empire (like recent actions with YouTube). |
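A rough sketch of that interim unwrapping, in Python rather than the project's own code. It assumes the wrapper always has the shape quoted earlier in the thread (`http://68k.news/article.php?loc=US&a=` followed by the target URL as-is) and that `a` is the last parameter. `strip_wrapper` is a hypothetical helper name, not something from the 68k.news source.

```python
from urllib.parse import urlsplit


def strip_wrapper(url):
    """Return the URL wrapped by a 68k.news article.php link.

    Anything that is not such a wrapper link is returned unchanged.
    Assumes the wrapped URL is everything after the `a=` parameter.
    """
    parts = urlsplit(url)
    if parts.hostname != "68k.news" or parts.path != "/article.php":
        return url
    # Work on the raw string so a '#fragment' in the wrapped URL survives.
    _, _, query = url.partition("?")
    if query.startswith("a="):
        return query[len("a="):]
    param_at = query.find("&a=")
    if param_at == -1:
        return url  # no `a` parameter, nothing to unwrap
    return query[param_at + len("&a="):]


wrapped = "http://68k.news/article.php?loc=US&a=https://example.com/story?id=1"
print(strip_wrapper(wrapped))
```

As noted above, this only helps on browsers that can render the target page themselves; it does not restore the text-only article view.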
Also, news publishers might see this site as "one giant ad-block", even though it is made for old technology, and might have made Google disable it. If FrogFind uses DuckDuckGo, 68k.news could possibly use DuckDuckGo News (if DDG News existed). |
I have a temporary solution: move from Google News to an aggregator hosted on GH Pages. Paid hosting or self-hosting is a more permanent solution. |
Attempting to open any article results in "Failed to get the article :("
"Sorry - working on it! Invalid or incomplete HTML."