Web spiders are software agents that traverse the Internet gathering, filtering, and potentially aggregating information for a user. This article shows you how to build spiders and scrapers for Linux to crawl a Web site and gather information (stock data, in this case). Using common scripting languages and their collection of Web modules, you can easily develop Web spiders.
AstLinux also runs on mini-ITX computers and ordinary PC hardware, so you can scale your hardware according to your needs. Because you're starting from a minimalistic installation, you have the pleasant option of adding features as you need them, rather than pruning away cruft. You might look at something like the Soekris net4801 and laugh. "Haha!" you say, "I have more computing power in my wristwatch!" Which may be true. If so, it means AstLinux can be ported to your wristwatch for the all-time great conversation piece. But don't underestimate these little boards. They are tough as nails and tolerate conditions that kill off ordinary PC hardware.
According to Forrester Research, open source databases can lower an enterprise's database total cost of ownership (TCO) by fifty percent. This finding and others were presented in an online seminar delivered on Oct. 12, 2006 titled "Realizing the Value of Open Source Databases: How Sony Online Entertainment Replaced Oracle with EnterpriseDB."
IBM PLACESadmin helps developers create, modify, and manage indoor, location-based Web applications through a simple-to-use, Internet-based Web interface. The term PLACES stands for Point-of-interest, Locations, and Assets Catalog for Enterprise Services.
The "Slug" has gained much-improved Linux support, thanks to a new Debian installer that targets the device. The first release candidate of the debian-installer for Debian's forthcoming "Etch" distribution was released yesterday, offering nearly complete support to the Linksys $99 NSLU2 NAS gadget.
Sun Microsystems' move on Monday to make Java open source code under the General Public License will not deter an Apache Software Foundation project that promised an open source Java implementation of its own. The Harmony project, started May 18, 2005, moved out of its incubation phase and into full-fledged project status at the end of October. Moving out of incubation is a sign within the Apache Software Foundation that an open source project is organized and has a critical mass of contributors. The project expects to produce its 1.0 implementation of Java Standard Edition in mid-2007.
Casual Ubuntu users may have registered surprise when they first booted the distribution's Edgy Eft release this past October. Back at the beginning of the Edgy development cycle, much was made of the formation of a new, dedicated Art Team to develop a fresh look for the backgrounds and splash screens of the startup process. But when Edgy hit the shelves, the artwork was scarcely different from that of its predecessor, Dapper Drake.
On November 2, 2006, the embargo for Intel's Core 2 Extreme Quad QX6700 was lifted, which resulted in a flurry of reviews covering this flagship desktop processor. However, this morning happens to be an important date for Supercomputing 2006, and it serves as yet another milestone for Intel Corporation. This morning Intel will be introducing the Xeon 5300 series, perhaps better known by its codename, Clovertown. At Phoronix we have had these processors in-house for over a week now and today are able to share our thoughts on these quad-core server/workstation processors as we test them under GNU/Linux.
A new preliminary version of the Slackware-based STUX GNU/Linux live CD, version 0.9.2, was released Monday, featuring a 2.6.17 kernel and the KDE desktop environment. STUX is touted as a live CD distro that can automatically load and save main configuration and personal files on a writable partition.
Open Shakespeare is a UK-based project to publish a fully open edition of Shakespeare's works. It's not trying to be another "Shakespeare on the Web," just providing an HTMLized copy of the plays and works. Its goal is to produce a reusable package of free and open material, including the main source texts, encodings in various open formats including XML and PDF, ancillary material, a Python API, and other documentation and tools.
As we prepared to open a new Freedom Technology Center in a rehabilitated site in New Jersey, I came to learn that Verizon was capable of offering fiber service at our location. Officially, they only claim to support those using Microsoft Windows and Mac OS X with their service. In fact, with a little foreknowledge, you can have your FiOS service installed and activated, and then use it, with an entirely free operating system such as GNU/Linux.
The city of Vienna has migrated more than 100 servers to Red Hat Enterprise Linux, principally deploying Red Hat Enterprise Linux ES and AS on HP hardware.
The free Java community reacted positively, but cautiously, to the news that Sun Microsystems has released the code for Sun Java under the GNU General Public License. While community leaders showed appreciation of the news by cooperating in the announcement, developers in the free Java community reacted more tentatively, and at least some projects seem likely to continue development of their own implementations of Java.
Red Hat, a leading provider of open source, will introduce its newly integrated JBoss solutions to its Middle Eastern partners and detail the company's plan for partner activities in the region at Gitex 2006. The event will be held at the Dubai International Convention and Exhibition Centre from November 18 to 22.
The RIAA is a true champion of fair use. That's what RIAA president Cary Sherman wants you to believe. In an op-ed piece published by Cnet, Sherman champions the RIAA's unique understanding of fair use while taking digs at those who do not share the record industry's vision—like the Consumer Electronics Association.
See the very first broadcast of the IBM TV network. Joe Washington, host of HGTV's Ground Breakers, does a masterful job of explaining what IBM TV is all about.
This month's tasty installment explores the world of Roll-Your-Own Linuxes. Why would you want to assemble your own customized Linux image? Well, why not? For one thing, it's just plain fun. For another, despite the fact that there are hundreds of existing Linux distributions of all shapes and sizes, you might still find yourself wanting something that doesn't exist.
Underscoring the supercomputing market's move toward clusters, the Lawrence Livermore National Laboratory just got the first of four Linux-based clusters that researchers plan to put to work on climate studies, astrophysics, and tracking the lifespan of the country's nuclear weapon stockpile.
It was a split decision this year. Both Nancy Anthracite and Will Ross are recipients of the 2006 Linux Medical News Freedom award, co-sponsored with the International Medical Informatics Association. Ross and Anthracite have worked tirelessly to advance the cause of software freedoms in medicine: Anthracite through many activities in the VistA community and Ross through his work with Mendocino County Health Records Exchange and grant funding of important FOSS development.
Linux Networx today announced the next member of the LS Series, Performance Tuned (LS-P) Linux Supersystems. The LS-P Series is designed as a line of turnkey, production-ready systems for leading product design applications. LS-P systems are performance tuned for computational fluid dynamics (CFD), crash/impact analysis, and structural analysis applications. Visualization software from CEI is supported as an integrated application on all systems.