Linux News
The world is talking about GNU/Linux and Free/Open Source Software

Mass processing .doc files

Forum: Linux

Total Replies: 7

Author	Content
techiem2 Feb 07, 2008 12:43 PM EDT	OK, so apparently we need to make the syllabi available to the students, which means we want them in PDF (easy enough). The problem is that they contain a Course Outline section, basically an anticipated plan for the class schedule, which of course the students aren't supposed to have. So what I need to do is read the Word file, strip OUT that section (wherever it may be; there isn't a standard layout), and dump the "fixed" version to PDF without losing the formatting. At one point my boss had a Word macro to do it, but that's long since been lost. I assume there's a way to make OO.o do this, but I have no experience whatsoever with macros and such. Any pointers? Ideas? Other methods? Thanks guys and gals as always! (As an aside, my sign control system is coming along nicely. I'm at the point of figuring out how to control mplayer with it. The button to force conversion of a file doesn't work [the script starts, then dies soon after], but that process could be cron'd anyway, so it's not vital if I can't get it working from my PHP interface.) Mark II
Sander_Marechal Feb 07, 2008 1:01 PM EDT	Try running OOo in headless mode on a server and script it from the outside using Python. Converting to PDF is really easy. Take a look at Mirko Nasato's PyODConverter (http://www.artofsolving.com/opensource/pyodconverter). Try adapting that script so it strips the required section before converting to PDF. Then write up a cron job that watches some directory for .doc files and converts anything it finds into PDF. Shouldn't be that hard. I highly recommend subscribing to OOo's dev mailing list at api.openoffice.org for help.
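A rough sketch of the "cron job that watches a directory" step described above. It does not use PyODConverter; instead it assumes a LibreOffice `soffice` binary on the PATH that accepts `--headless --convert-to pdf --outdir`, which is a swapped-in technique and not something mentioned in the thread. The function names (`build_command`, `pending_docs`, `convert_all`) and the directory layout are hypothetical, and the section-stripping step is left out entirely.

```python
import subprocess
from pathlib import Path


def build_command(doc: Path, out_dir: Path) -> list[str]:
    """Build the conversion command (assumes a LibreOffice `soffice` on PATH)."""
    return ["soffice", "--headless", "--convert-to", "pdf",
            "--outdir", str(out_dir), str(doc)]


def pending_docs(in_dir: Path, out_dir: Path) -> list[Path]:
    """Return .doc files with no PDF yet, or whose PDF is older than the source."""
    todo = []
    for doc in sorted(in_dir.glob("*.doc")):
        pdf = out_dir / (doc.stem + ".pdf")
        if not pdf.exists() or pdf.stat().st_mtime < doc.stat().st_mtime:
            todo.append(doc)
    return todo


def convert_all(in_dir: Path, out_dir: Path, run=subprocess.run) -> list[Path]:
    """Convert every pending .doc; `run` is injectable so it can be faked in tests."""
    out_dir.mkdir(parents=True, exist_ok=True)
    converted = []
    for doc in pending_docs(in_dir, out_dir):
        # check=True makes a failed conversion raise instead of passing silently.
        run(build_command(doc, out_dir), check=True)
        converted.append(doc)
    return converted
```

Called from a small script that cron runs every few minutes, for example `convert_all(Path("incoming"), Path("published"))` with placeholder directory names, this would only reconvert files that are new or have changed since their PDF was written.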
ColonelPanik Feb 07, 2008 3:11 PM EDT	Do it in Moodle
dinotrac Feb 07, 2008 3:47 PM EDT	Sander - Yup. Headless OOo is an amazing thing -- an amazing thing that most people and businesses know nothing about. I know at least one company that paid tons of money for a fancy document server that, in their use, did nothing more than running the files through OOo would have done. Dopes.
azerthoth Feb 07, 2008 4:20 PM EDT	I haven't played with it, but I do remember, when playing with Samba configs, seeing an example path to a PDF printer, i.e. feed in a file and get out a .pdf.
techiem2 Feb 07, 2008 10:30 PM EDT	Yeah, I have cups-pdf set up. The main thing I'm trying to figure out is how to automatically process the silly docs to rip out what's not wanted... I'll start by taking a look at the Python and OOo headless stuff when I have some time to work on it. Could be useful.
dinotrac Feb 07, 2008 11:43 PM EDT	If you've got .doc files, you can do it a couple of ways. I think you can get headless OOo to run its own scripting, but I've never tried that. You can pretty easily convert the .doc to ODF. From there you can do amazing things with stylesheets and/or scripts, because it's all XML.
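To illustrate the "it's all XML" point: an .odt file is a zip archive whose `content.xml` holds the document body, so a script can drop a section before the PDF step. The sketch below assumes the file has already been converted from .doc to .odt, and that the unwanted section begins with a real heading element (`text:h`) whose text is exactly the given title. Syllabi that fake their headings with bold paragraphs would need a different match. The function names `strip_section` and `strip_section_from_odt` are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
H_TAG = f"{{{TEXT_NS}}}h"
LEVEL_ATTR = f"{{{TEXT_NS}}}outline-level"


def strip_section(content_xml: bytes, title: str) -> bytes:
    """Remove the heading named `title` and everything after it, up to the
    next heading of the same or a higher outline level."""
    # Re-register the document's own prefixes so they survive serialization.
    for _, (prefix, uri) in ET.iterparse(io.BytesIO(content_xml),
                                         events=("start-ns",)):
        ET.register_namespace(prefix, uri)
    root = ET.fromstring(content_xml)
    for parent in root.iter():
        doomed, level = [], None
        for child in list(parent):
            if child.tag == H_TAG:
                child_level = int(child.get(LEVEL_ATTR, "1"))
                if level is None:
                    if "".join(child.itertext()).strip() == title:
                        level = child_level  # section starts here
                elif child_level <= level:
                    break  # next sibling section: stop collecting
            if level is not None:
                doomed.append(child)
        if doomed:
            for child in doomed:
                parent.remove(child)
            break  # only the first matching section is removed
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def strip_section_from_odt(src, dst, title: str) -> None:
    """Copy an .odt archive from `src` to `dst`, rewriting only content.xml."""
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for info in zin.infolist():
            data = zin.read(info.filename)
            if info.filename == "content.xml":
                data = strip_section(data, title)
            # Reusing `info` keeps each entry's name, order and compression.
            zout.writestr(info, data)
```

Something like `strip_section_from_odt("syllabus.odt", "syllabus-public.odt", "Course Outline")` (file names are placeholders) would then produce a copy that can be handed to the PDF conversion step. Styles, images and the rest of the archive are copied through untouched.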
Sander_Marechal Feb 08, 2008 12:38 AM EDT	I'm pretty sure that it's possible using Python. I've adapted that PyODConverter script myself to do things like updating the table of contents. The biggest problem is that OOo's API is huge and the docs are all geared to Java, not Python.
