Wikisource talk:WS Export

Suggestions for improvement

Localization of link

The added link should be properly localized. As it is now, the link appears in only one language, irrespective of the interface language selected in the user's preferences. Compare with the link for exporting a PDF version, which changes if the user changes their language.

Parameters for flexibility

Hi! Thank you for this great tool. I have been considering how we can use it to the best effect at English Wikisource, and I have some suggestions that would allow us to integrate the book-generation process with the existing PediaPress PDF tool, as well as allowing more unconventional collections of pages.

Firstly, the PediaPress tool operates on a simple list of pages, like this: en:Wikisource:Books/Bull-dog Drummond. We could use ws-summary and metadata microformats on this page to generate the book, but it will include the raw pages list, as well as the formatted contents contained in the work itself. Thus, my first suggestion is to accept a parameter "ignorelistpage" which will direct WSexport to harvest the links in the list, but not include the page in the final document.

This manual listing method (with a page-collector helper) allows us to group separate works together (for example, a collection of related scientific papers), allows works whose pages are not all linked from a single front page (like newspapers), and allows us to have a curated collection of verified (and editable) books.

Secondly, if the metadata is included on the list page, that allows flexibility in unusual works (such as a compilation of separate works by many authors). However, for the common case where the information on a single page is sufficient, this duplication of data invites mistakes to creep in when pages are updated, but not the related book. Therefore, I propose a "metadatalocation" parameter, which directs WSexport to harvest metadata from another page.

I am working on a way to integrate this into a system more targeted at WSexport than the PediaPress extension, but I would need to know exactly what format the script would take the parameters in before it could be completed. Thank you for your great work! Inductiveload 03:06, 20 February 2012 (UTC)
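To make the two proposed parameters concrete, here is a minimal sketch of the behaviour described above. It is not WSexport's actual code: the `Page` class and the functions `collect_book_pages` and `resolve_metadata_page` are hypothetical names invented for illustration, and only the parameter names "ignorelistpage" and "metadatalocation" come from the proposal.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Page:
    """Hypothetical stand-in for a wiki page: a title plus its outgoing links."""
    title: str
    links: list[str] = field(default_factory=list)


def collect_book_pages(wiki: dict[str, Page], list_title: str,
                       ignorelistpage: bool = False) -> list[str]:
    """Return the page titles that would go into the exported book, in order."""
    list_page = wiki[list_title]
    # Harvest the links from the list page, dropping duplicates but keeping order.
    harvested = list(dict.fromkeys(list_page.links))
    if ignorelistpage:
        # The list page is used only as a source of links and is left out.
        return harvested
    return [list_title] + harvested


def resolve_metadata_page(list_title: str,
                          metadatalocation: str | None = None) -> str:
    """Return the page to read metadata from: the override if given, else the list page."""
    return metadatalocation or list_title


wiki = {
    "Books/Example": Page("Books/Example", ["Example/Chapter 1", "Example/Chapter 2"]),
}
print(collect_book_pages(wiki, "Books/Example", ignorelistpage=True))
print(resolve_metadata_page("Books/Example", metadatalocation="Example"))
```

With "ignorelistpage" set, the raw list page never reaches the output; with "metadatalocation" set, the metadata lives in one place and cannot drift away from the work's own front page.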

For the ignorelistpage, no problem, it's a good idea! I'll implement it. The metadatalocation is a little more complicated, but I think it's possible. Thanks for your interest. Tpt 20:32, 21 February 2012 (UTC)
Wonderful! I feel the ignorelistpage is the most important, as that means any pagelist that is not part of the work itself can be used easily. I have talked to the mwlib/PediaPress list, and it is unlikely that we can get enough "care" to have the Wikisource-specific metadata thing implemented any time soon (sounds like they have a lot of problems, and catering to Wikisource is not one of them), so I might try to knock a client-side one up myself. If you can't get an external metadata collection scheme working, it's not a huge crisis, it just means we'll need to keep an eye on diverging metadata. Thanks for the quick reply, I'll keep you informed of any useful progress I make on a book generator.
One thing that was mentioned in the list was the Dublin Core metadata system. Can you see any mileage in the DC system, as compared to (or in addition to) the current WS microformat? Just food for thought. Cheers, Inductiveload 19:05, 22 February 2012 (UTC)
The current WS microformat is inspired by w:en:Dublin Core, which is not adapted to HTML content. The XHTML output of WSexport adds Dublin Core metadata taken from the microformat. I'm working on a metadata system built with ProofreadPage that will provide an API and add Dublin Core to the header of Wikisource pages. Here is a quick description of the key problems of the project. Tpt 21:50, 26 February 2012 (UTC)
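As a rough illustration of turning microformat values into Dublin Core metadata in an XHTML header, here is a sketch under stated assumptions. The field names in `FIELD_TO_DC` and the function `dublin_core_meta` are hypothetical and are not taken from WSexport's source; the actual microformat fields and output may differ.

```python
from html import escape

# Hypothetical mapping from microformat field names to Dublin Core element names.
FIELD_TO_DC = {
    "ws-title": "DC.title",
    "ws-author": "DC.creator",
    "ws-publisher": "DC.publisher",
    "ws-year": "DC.date",
}


def dublin_core_meta(microformat: dict) -> list:
    """Build one <meta> tag per microformat field that has a value."""
    tags = []
    for ws_field, dc_name in FIELD_TO_DC.items():
        value = microformat.get(ws_field)
        if value:
            # Escape the value so quotes and ampersands stay valid in the attribute.
            tags.append(
                f'<meta name="{dc_name}" content="{escape(value, quote=True)}" />'
            )
    return tags


for tag in dublin_core_meta({"ws-title": "Bull-dog Drummond", "ws-year": "1920"}):
    print(tag)
```

Fields missing from the microformat are simply skipped, so a page with partial metadata still produces a valid (if shorter) header.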

Implementing in Wikibooks

Hi,

I'm very interested in using this tool on Wikibooks. At the moment the PDF exporter there is pretty poor, and it would be great to get an EPUB exporter in place. Do you have any advice as to how this could be done? The book I want to convert is here. Pluke (talk) 11:39, 20 January 2013 (UTC)

WSexport is currently designed to work on Wikisource, so the tool is adapted to Wikisource's specifics, and some changes to its code are needed to make it work well for Wikibooks. It's not impossible, but I'll do it only if there is a strong request from the Wikibooks community. Tpt (talk) 20:01, 20 January 2013 (UTC)
Hi Tpt, I'd definitely have a use for it: I need to export this book. I'll see if we can get a few more people interested. The current PDF export feature is pretty poor. Pluke (talk) 19:35, 22 January 2013 (UTC)

Running from command line

I've just submitted a pull request with a small change to the way paths are coded in the CLI part of the WSexport tool. I'm working on a little thing that will let me keep my local ePub library up to date with Wikisource (updating books as they change on WS). Does anyone know whether such a thing already exists anywhere? Would it be of interest to anyone? And many thanks to Tpt for a great tool! :-) Samwilson (talk) 05:12, 19 March 2013 (UTC)

Implementing it for Wikipedia

@Tpt: Thanks for this great tool, which is being used on Telugu Wikisource. I see a need for this tool to work on Telugu Wikipedia, where we would like to produce books from collections of featured articles or pages under a project. The PediaPress tool has bugs in rendering the front cover for the Telugu language, and although the bug was reported, nothing much has happened from PediaPress or WMF over the past few years. I also feel EPUB is a better format for Telugu, as it allows search capability. Can you share your thoughts on how this can be expanded for use on Wikipedia, and any tips for doing this in case you are not able to spare time soon? --Arjunaraoc (talk) 04:48, 11 May 2015 (UTC)

Research on the potential of Wikimedia content in EPUBs

Hello Tpt, we are researchers from the Publishing Lab in Amsterdam. As part of our research we tested several tools that are used to export Wikimedia content to EPUBs, and we were very happy to find your tool to be very effective. (Researching Existing Tools)

Here is the documentation of our ongoing research on Meta-Wiki: Research on e-books with Wikimedia content.

We would like to work on a tool that allows users to gather content from multiple Wikimedia projects and collect them in an EPUB. We are still at the beginning of the journey, so the specifics are not clear yet, but it would be great to have your input. We have a few questions about the tool.

Have you ever considered or worked on developing WSexport further so that it collects content from multiple wikis rather than from a single wiki project? Did you encounter any difficulties?

We noticed that you worked on another instance of WSexport that is browser-based (http://wsexport.wmflabs.org, currently not available?). We were curious to know the differences between the two and what you find useful about having a browser-based version.

Thanks for your time. Cristina Juan

13zhampu13 (talk) 10:40, 3 October 2016 (UTC)

WSexport/Scunthorpe problem

I suggest renaming this page slightly (i.e. separating the S and “export”) so that overzealous web filters don’t block it when they notice the “Sex” in the name. JohnSmith13345 (talk) 10:02, 13 June 2019 (UTC)

Would WsExport do? --Zyephyrus (talk) 07:54, 14 June 2019 (UTC)
@Zyephyrus, JohnSmith13345: I think something more descriptive might be good, such as Wikisource Export Tool. I suggest the same change to the tool: https://github.com/wsexport/tool/pull/202 Sam Wilson 00:28, 5 September 2019 (UTC)
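For readers unfamiliar with the Scunthorpe problem, the sketch below shows how a naive filter would treat the names discussed in this thread. The blocklist and the function `naive_filter_blocks` are hypothetical; real web filters vary, and this only assumes a case-insensitive substring match.

```python
# Hypothetical blocklist; real web filters and their word lists vary.
BLOCKLIST = ("sex",)


def naive_filter_blocks(name: str) -> bool:
    """Return True if any blocklisted word occurs anywhere in the name.

    The match is a case-insensitive substring test, which is the
    behaviour that causes the Scunthorpe problem.
    """
    lowered = name.lower()
    return any(word in lowered for word in BLOCKLIST)


candidates = ["WSexport", "WsExport", "WS Export", "Wikisource Export Tool"]
for name in candidates:
    status = "blocked" if naive_filter_blocks(name) else "allowed"
    print(f"{name}: {status}")
```

Under this assumed filter, changing only the capitalisation (WsExport) would still match, whereas inserting a space or using a longer descriptive name would not.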

Request: Book cover design

Can you update this tool to add an option for excluding the book cover image from the output file? This image is not suitable for all book titles. Omda4wady (talk) 12:56, 20 September 2020 (UTC)

@Omda4wady: Sorry for my slow reply. Do you mean an option to only exclude the cover image, but still include all other images? Or does the existing option to exclude images work okay? Sam Wilson 05:52, 23 November 2020 (UTC)

Error emails

@Samwilson: Hey there, haven't followed the development of the tool much but it looks like things are moving faster, which is great. I'm still listed as maintainer and recently started getting error notification emails: "[Wikisource Export TEST] Uncaught PHP Exception Symfony\Component". Where is this configured? – Jberkel (talk) 12:06, 18 November 2020 (UTC)

@Jberkel: sorry for my slow reply! These emails go to all maintainers, as configured in https://toolsadmin.wikimedia.org/tools/id/wsexport . It's slightly misleading, in that we're using the wsexport maintainers list for the wsexport-test instance as well, just to make it easier than maintaining them separately. Hopefully the number of emails will decrease as we improve the tool! Sam Wilson 05:51, 23 November 2020 (UTC)
@Samwilson: Ah ok, so not production errors? What's the test instance? Used for staging? – Jberkel (talk) 08:41, 23 November 2020 (UTC)
@Jberkel: No, there are lots of production errors too. :-/ There aren't actually any new errors, it's just that we've made the reporting system much more annoying! :) And yep, the test instance is the staging server; it's at https://wsexport-test.wmflabs.org . Sam Wilson 09:47, 23 November 2020 (UTC)

Broken WSexport

The export function does not seem to work properly: chapters are not exported as part of the EPUBs or PDFs; only the table of contents and the about page are included. E7A986E4BA91 (talk) 01:59, 2 March 2021 (UTC)

@E7A986E4BA91: Thanks for reporting this. Which book are you getting this error with? Sam Wilson 02:23, 2 March 2021 (UTC)
@Samwilson: Many pages, for example https://zh.wikisource.org/zh/西遊記 E7A986E4BA91 (talk) 02:47, 3 March 2021 (UTC)
@E7A986E4BA91: Thanks! It looks like this has been reported as Phabricator:T275967. We'll follow up there soon. Sam Wilson 02:57, 3 March 2021 (UTC)

Apostrophe in the title

It seems that WS Export does not work if there is an apostrophe in the title of the book.

See this example: La Costa d'Avorio. If you try to click on the PDF link at the top of the page, it does not work.

Can someone check and solve this issue? Thanks, regards, --Accurimbono (talk) 10:24, 2 December 2021 (UTC)

@Accurimbono: It’s not a bug in the tool itself, but in the template {{Intestazione}}; the blue Scarica button next to the page title works perfectly. I don’t know how the template works, but you should basically prevent it from HTML-encoding the page title. Tacsipacsi (talk) 17:04, 2 December 2021 (UTC)
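The failure mode described in the reply above can be sketched as follows. This is an illustration only: the query parameter names (`lang`, `format`, `page`) and the helper `export_url` are assumptions made for the example, and the exact entity produced by the template is not known from this thread; `&#x27;` is just what Python's `html.escape` emits for an apostrophe.

```python
from html import escape, unescape
from urllib.parse import urlencode

BASE = "https://ws-export.wmcloud.org/"


def export_url(title: str, fmt: str = "pdf", lang: str = "it") -> str:
    """Build an export link; the parameter names here are assumed, not verified."""
    return BASE + "?" + urlencode({"lang": lang, "format": fmt, "page": title})


title = "La Costa d'Avorio"

# Correct: the raw title is URL-encoded once, so the apostrophe becomes %27.
print(export_url(title))

# Broken: the title was HTML-encoded first, so the entity itself gets URL-encoded
# and the tool is asked for a page whose name contains the literal entity text.
print(export_url(escape(title, quote=True)))

# One way to repair the link: undo the HTML encoding before building the URL.
print(export_url(unescape(escape(title, quote=True))))
```

The first and third links are identical; the second one points at a page title that does not exist, which matches the symptom of a link that "does not work".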

Maybe someone could bring it up again. --Aschroet (talk) 09:42, 18 April 2023 (UTC)

@Aschroet: Rebooting it now. Sam Wilson 09:45, 18 April 2023 (UTC)
@Aschroet: Done. It is up again. Sorry for the downtime! Sam Wilson 09:50, 18 April 2023 (UTC)

Thank you for the quick response, --Aschroet (talk) 09:56, 18 April 2023 (UTC)

It's down.

And for a while now as well. Ștefan Tărâță (talk) 14:24, 27 April 2023 (UTC)

@Ștefan Tărâță: It's certainly been down more than usual lately, but it looks like it's back up now. Generally, this is due to excessive bot traffic. I've created phab:T335553 to look at what's going on. Sam Wilson 01:29, 28 April 2023 (UTC)

Seems to be down again.

The https://ws-export.wmcloud.org website seems to be struggling lately... Many thanks in advance for bringing it back again! Epigeneticist (talk) 20:18, 12 June 2023 (UTC)

😢 At this point, is it ever gonna work again? Eugrus (talk) 19:54, 19 February 2024 (UTC)