Wikisource:Unify transcription namespaces


The "Index" and "Page" namespaces are usually used together. This page suggests methods of unifying them and explores the benefits, costs, and disadvantages of each.

Benefits

  • A single namespace means new users have less to learn before they can understand transcription projects on Wikisource.
  • If the djvu "Index" page and its subpages are in the same namespace, all subpages can be moved when the Index page is moved.
  • The name "Index" would no longer be used for the namespace; in English, in a library context, "index" means catalogue.

Data migration

Direct data manipulation

The database schema contains a 'namespace' for every page. With a single line of SQL, all "Index" and "Page" pages can be changed so that those pages become attached to a different namespace.
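
As a rough illustration, the sketch below runs such an update against an in-memory SQLite table that mimics the page_namespace and page_title columns of MediaWiki's page table. The namespace numbers (104, 106, 250) are placeholders chosen for the example, not values taken from this proposal. A real wiki would use its own configured numbers, and would also need to check for title collisions and refresh any caches.

```python
import sqlite3

# Hypothetical namespace numbers, for illustration only.
NS_PAGE = 104
NS_INDEX = 106
NS_UNIFIED = 250

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page ("
    " page_id INTEGER PRIMARY KEY,"
    " page_namespace INTEGER NOT NULL,"
    " page_title TEXT NOT NULL,"
    " UNIQUE (page_namespace, page_title))"
)
conn.executemany(
    "INSERT INTO page (page_namespace, page_title) VALUES (?, ?)",
    [
        (NS_INDEX, "Example.djvu"),
        (NS_PAGE, "Example.djvu/1"),
        (NS_PAGE, "Example.djvu/2"),
        (0, "Some_work"),  # main namespace, must stay untouched
    ],
)

# The single statement that re-attaches both namespaces to the unified one.
moved = conn.execute(
    "UPDATE page SET page_namespace = ? WHERE page_namespace IN (?, ?)",
    (NS_UNIFIED, NS_INDEX, NS_PAGE),
).rowcount
conn.commit()

rows = conn.execute(
    "SELECT page_namespace, page_title FROM page ORDER BY page_id"
).fetchall()
for namespace, title in rows:
    print(namespace, title)
```

The UNIQUE constraint in the sketch stands in for the rule that a namespace cannot hold two pages with the same title, so the update would fail rather than silently merge an "Index" page and a "Page" page that share a title.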

Bot

The pywikipedia bot framework has the ability to "rename" all pages to the desired namespace.

Proofread Page extension

In order for the Proofread Page extension to work with only a single unified namespace, the code will need to be changed.

Djvu files

For djvu files, only a few lines of code need to be changed.

Series of images

For this proposal to work for books made up of separate images, a little more logic is required. On page creation, the extension will need to follow this order:

  1. If there is an image with the same pagename, show the side-by-side view.
  2. Otherwise, show the pretty "edit index" page.
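
The order above can be sketched as a small decision function. This is only an illustration of the rule, not code from the Proofread Page extension: the function name, the view labels, and the idea of passing in a set of known image names are all assumptions made for the example.

```python
from typing import AbstractSet


def choose_view(pagename: str, image_names: AbstractSet[str]) -> str:
    """Pick the view to show when a page is created (hypothetical labels)."""
    # Step 1: an image with the same name means this page is a page scan.
    if pagename in image_names:
        return "side-by-side"
    # Step 2: no matching image, so treat the page as an index.
    return "edit-index"


# Hypothetical set of uploaded images for one book.
images = {"Example p1.png", "Example p2.png", "Example p3.png"}

print(choose_view("Example p1.png", images))
print(choose_view("Example", images))
```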

Backwards compatibility

Todo: What level of backwards compatibility would be desirable?

Proposed namespaces

These sections describe the methods which could be used to unify the two namespaces.

Image

One method of unifying the namespaces is to develop a more specialised handler for the "Image" namespace.

Migration

A book that is a djvu would be changed like this:

Index:Example.djvu   ->  Image:Example.djvu
Page:Example.djvu/1  ->  Image:Example.djvu/1
Page:Example.djvu/2  ->  Image:Example.djvu/2
Page:Example.djvu/3  ->  Image:Example.djvu/3
..etc

A book that is made up of separate images would be changed like this:

Index:Example        ->  Image:Example
Page:Example p1.png  ->  Image:Example p1.png
Page:Example p2.png  ->  Image:Example p2.png
Page:Example p3.png  ->  Image:Example p3.png
..etc
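
The renames listed above follow one rule: replace the "Index:" or "Page:" prefix with the new namespace and leave the rest of the title alone. A minimal sketch of that rule is shown below. The function name and the target parameter are hypothetical, and the sketch only computes new titles; it does not perform any move on a wiki.

```python
OLD_NAMESPACES = ("Index", "Page")


def unify_title(title: str, target: str = "Image") -> str:
    """Return the title with an Index:/Page: prefix swapped for `target`."""
    namespace, sep, rest = title.partition(":")
    if sep and namespace in OLD_NAMESPACES:
        return f"{target}:{rest}"
    # Titles in other namespaces are left unchanged.
    return title


old_titles = [
    "Index:Example.djvu",
    "Page:Example.djvu/1",
    "Index:Example",
    "Page:Example p1.png",
]
for old in old_titles:
    print(f"{old} -> {unify_title(old)}")
```

A bot could use a mapping like this to plan its renames before moving anything, which makes it easy to check for collisions in advance.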

Implementation

In this proposal, the purpose of the Image namespace changes in the following ways:

Image:Example.djvu

Currently these pages display a thumbnail of the first pagescan, and the {{Information}} template that we add on the Commons page also appears on Wikisource. Under this proposal, if there is no local page, the top of the page would show a thumbnail on the left and <pagelist/> on the right, with the Commons page displayed underneath. We could still edit this page and change the presentation if we wanted to. Our "Index" pages usually hold very similar information to the Commons image page.

Image:Example.djvu/1

This is not a legal image name. The upload form will prevent a file from being uploaded with this name, so there is no harm in putting a normal text page there. I propose that we put the side-by-side view of a djvu page at this name.

Image:Example

This is also not a legal image name. The upload form will prevent a file from being uploaded with this name, so there is no harm in putting an index of the pagescans at this name. It would look like:

<nowiki>
 {{Index
 | title=Book title
 | pubyear=1901
 | author=Old dead person
 | pages=[[Image:Book title p1.png|p1]] [[Image:Book title p2.png|p2]] [[Image:Book title p3.png|p3]]
 | ...
 }}
</nowiki>

Image:Example p1.png

By default, image pages would look like the normal Commons page with a thumbnail. When the user clicks "create", the side-by-side edit view would be shown, allowing transcription. When the editor clicks "save", the side-by-side view would be shown.
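
The four kinds of title described in this section could be told apart by their shape alone. The sketch below is one way to do that, under simplifying assumptions that are not part of the proposal: a title counts as an uploadable file name only if it has a file extension and no slash, and the category labels are invented for the example.

```python
import os


def classify_image_title(title: str) -> str:
    """Guess the role of an Image: title from its shape (hypothetical labels)."""
    name = title.split(":", 1)[1] if title.startswith("Image:") else title
    base, slash, subpage = name.partition("/")
    ext = os.path.splitext(base)[1].lower()
    if slash:
        # e.g. Image:Example.djvu/1, one page of a djvu file
        if ext == ".djvu" and subpage.isdigit():
            return "djvu-page"
        return "unknown"
    if ext == ".djvu":
        return "djvu-index"   # e.g. Image:Example.djvu
    if ext == "":
        return "scan-index"   # e.g. Image:Example, not a legal file name
    return "scan-page"        # e.g. Image:Example p1.png


for t in ["Image:Example.djvu", "Image:Example.djvu/1",
          "Image:Example", "Image:Example p1.png"]:
    print(t, "->", classify_image_title(t))
```

Titles that contain a dot for other reasons (an abbreviation, for instance) would be misread by this extension check, so a real handler would need a stricter test of what the upload form accepts.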

Benefits

The most important benefit is that we eliminate special namespaces. There are a few ways this helps:

  • New wikis do not need to set up additional namespaces in order to implement Proofread Page, which means they can start using it immediately.
  • After a djvu or image is uploaded, the user is shown the Image page. Users on other wikis are used to an Image page, and are familiar with how to edit them.

Page

Unifying into a "Page" namespace would require the least amount of work: the indexes are fewer in number, and viewing their history is less important. Having the indexes in a "Page" namespace is not very strange.

Scanset

A "Scanset" or "Transcription" namespace accurately describes the contents of the pages found in that namespace.

Migration

A book that is a djvu would be changed like this:

Index:Example.djvu   ->  Scanset:Example.djvu
Page:Example.djvu/1  ->  Scanset:Example.djvu/1
Page:Example.djvu/2  ->  Scanset:Example.djvu/2
Page:Example.djvu/3  ->  Scanset:Example.djvu/3
..etc

A book that is made up of separate images would be changed like this:

Index:Example        ->  Scanset:Example
Page:Example p1.png  ->  Scanset:Example p1.png
Page:Example p2.png  ->  Scanset:Example p2.png
Page:Example p3.png  ->  Scanset:Example p3.png
..etc