Difference between revisions of "User:Dominik.epple/DocumentConverterInstall"


Revision as of 09:19, 16 February 2018

DocumentConverter Quickinstall Guide / Cheatsheet

Introduction

This document aims to be a condensed "HOWTO" like installation walkthrough.

The other relevant documentation is to be considered as a reference.

This document describes the fully clustered setup. Simpler setups can be derived as special cases.

This document has been created and verified on CentOS7 and OX App Suite 7.8.4.

Design description

We consider a clustered setup with multiple middleware nodes and multiple converter nodes.

We consider a highly available connection from the middleware nodes to the converter nodes. It is realized with HAProxy instances running locally on the middleware nodes. These do simple round-robin to the Apache instances on the frontend nodes, which in turn do correct, session-stickiness-aware routing to the converter nodes.

We need the Apache instances to get correct session-sticky routing behavior.

We need the HAProxy instances to connect to the Apache instances in a highly available fashion. If the infrastructure offers that functionality by other means, this is also okay. We present the HAProxy-based setup to give an example of a fully working setup.

Software installation and configuration

Middleware nodes

Prerequisites: Drive is installed and working. We need Drive to upload test documents and verify their conversion. Since we are discussing a clustered setup, this also assumes that a clustered filestore is available and working. If you are only going for a single-node proof-of-concept setup, that is of course not relevant.

Packages to be installed:

open-xchange-documentconverter-api open-xchange-documentconverter-client

Configuration:

/opt/open-xchange/etc/permissions.properties:
# assume a global switch on for testing. further config cascade stuff etc is out of scope
# of this document.
com.openexchange.capability.document_preview=true
documentconverter-client.properties:
# this needs to be adjusted. will be discussed below.
com.openexchange.documentconverter.client.remoteDocumentConverterUrl=http://host[:port]/documentconverterws

Frontend nodes

None

Converter nodes

Packages to be installed:

open-xchange-documentconverter-server open-xchange-documentconverter-api readerengine

(Note: the official documentation also mentions "pdf2svg" which at least on CentOS7 does not exist, but rather a package named "readerengine-pdf2svg" is pulled as dependency. So for the moment let's assume we don't need to install that explicitly, but if you are following this guide on Debian you should double-check.)

Configuration:

/opt/open-xchange/etc/server.properties:
# Pick a unique route, which will be configured consistently in apache
com.openexchange.server.backendRoute=DC1
# Clustered setups need to listen not only on localhost
com.openexchange.connector.networkListenerHost=*
/opt/open-xchange/documentconverter/etc/documentconverter.properties:
# TODO figure out the recommended Cache setup for a clustered scenario
com.openexchange.documentconverter.RemoteCacheUrls = ?
# Default is 3. Adjust for your sizing.
com.openexchange.documentconverter.jobProcessorCount=3

Services configuration:

systemctl start open-xchange-documentconverter-server
systemctl enable open-xchange-documentconverter-server

Configure the middleware to converter connectivity

So that was the trivial part. Now it gets interesting.

First test: direct connectivity

Pick a middleware node. From there, test the direct connection to one converter node, which is called dc1 here. By default the service listens on port 8008 at the path /documentconverterws.

# curl http://dc1:8008/documentconverterws/
<html>
<head><meta charset="UTF-8"><title>Open-Xchange DC</title></head>
<body><h1 align="center">OX Software GmbH DC</h1>
<p>WebService is running...</p>
<p>Error Code: 0</p>
<p>API: v5</p></body>
</html>

That's what it should look like: HTTP status code 200 (not shown for clarity, but you can verify it with curl -v), "WebService is running..." and "Error Code: 0".
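
The check described above can also be scripted. The following is a minimal sketch, not something shipped with the product: dc_response_ok is a hypothetical helper name, and it only looks for the two markers shown in the sample output.

```shell
# Hypothetical helper: reads a response body on stdin and succeeds only if
# it contains both markers from the sample output above.
dc_response_ok() {
    body="$(cat)"
    printf '%s\n' "$body" | grep -qF 'WebService is running' || return 1
    printf '%s\n' "$body" | grep -qF 'Error Code: 0</p>' || return 1
}

# Usage, assuming the converter node is reachable as dc1:
#   curl -s http://dc1:8008/documentconverterws/ | dc_response_ok && echo OK
# The HTTP status code can be printed separately with:
#   curl -s -o /dev/null -w '%{http_code}\n' http://dc1:8008/documentconverterws/
```

The helper does not look at the HTTP status code itself, so the second curl call (or curl -v) is still useful for the first manual check.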

If that is successful, go ahead and configure that URL on the middleware node:

documentconverter-client.properties:
# this needs to be adjusted. will be discussed below.
com.openexchange.documentconverter.client.remoteDocumentConverterUrl=http://dc1:8008/documentconverterws
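
If you switch this URL several times while working through the tests, a small helper can save typing. This is only a sketch: set_converter_url is a hypothetical name, and the location of documentconverter-client.properties is passed in as an argument because it depends on your installation.

```shell
# Hypothetical helper: set the converter URL in a properties file.
# $1 = path to documentconverter-client.properties, $2 = new URL.
set_converter_url() {
    props_file="$1"
    url="$2"
    key="com.openexchange.documentconverter.client.remoteDocumentConverterUrl"
    [ -f "$props_file" ] || { echo "no such file: $props_file" >&2; return 1; }
    tmp_file="$(mktemp)" || return 1
    # Keep every line that does not start with "key=" (comments stay untouched)
    awk -v prefix="$key=" 'index($0, prefix) != 1' "$props_file" > "$tmp_file" || return 1
    # Append the new value as the only remaining entry for the key
    printf '%s=%s\n' "$key" "$url" >> "$tmp_file"
    # Write back in place so ownership and permissions of the file are kept
    cat "$tmp_file" > "$props_file"
    rm -f "$tmp_file"
}

# Usage (the file path is a placeholder for your installation):
#   set_converter_url /path/to/documentconverter-client.properties \
#       http://dc1:8008/documentconverterws
```

The new entry ends up at the end of the file. As with a manual edit, the middleware has to be restarted afterwards.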

Do a testing cycle as described below.

If everything works, you confirmed that the middleware node and the converter node are configured correctly.

Now ensure all your middleware nodes and all your converter nodes are configured likewise. Make sure the converter nodes get unique com.openexchange.server.backendRoute values (see above).
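
With more than a handful of converter nodes, the uniqueness of the backendRoute values is easy to get wrong. The following sketch shows one way to check it. backend_route and duplicate_routes are hypothetical helper names, and how you collect the values from the nodes (ssh, configuration management, ...) is up to you.

```shell
# Hypothetical helper: print the backendRoute value from a server.properties file.
backend_route() {
    awk -F= '$1 == "com.openexchange.server.backendRoute" { print $2 }' "$1"
}

# Hypothetical helper: reads "node route" pairs on stdin, one per line,
# and prints every route that is used by more than one node.
duplicate_routes() {
    awk '{ print $2 }' | sort | uniq -d
}

# Usage on a converter node:
#   backend_route /opt/open-xchange/etc/server.properties
# Usage with a collected list (node names are examples):
#   printf 'dc1 DC1\ndc2 DC2\ndc3 DC1\n' | duplicate_routes
```

An empty output of duplicate_routes means all routes in the list are unique.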

Second test: use Apache for loadbalancing

Pick one frontend node to configure its Apache for loadbalancing for the converter nodes.

There are sample configuration stanzas in our default configuration. They are just fine. Just make sure the route parameters match the ones from the converter nodes.

Take particular care to configure the Allow line correctly. The /documentconverterws endpoint must not be made available publicly!

A sample Apache configuration looks like this:

<Proxy balancer://oxcluster_docs>
    Order Deny,Allow
    Deny from all
    # configure the allowed IPs such that only the middleware nodes are able to access
    # the /documentconverterws endpoint must not be made available publicly!
    Allow from 10.0.1
    BalancerMember http://dc1:8008 timeout=100 smax=0 ttl=60 retry=60 loadfactor=50 keepalive=On route=DC1
    BalancerMember http://dc2:8008 timeout=100 smax=0 ttl=60 retry=60 loadfactor=50 keepalive=On route=DC2
    # add further converter nodes, as many as you have
    ProxySet stickysession=JSESSIONID|jsessionid scolonpathdelim=On
    SetEnv proxy-initial-not-pooled
    SetEnv proxy-sendchunked
</Proxy>

ProxyPass /documentconverterws balancer://oxcluster_docs/documentconverterws
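
To compare the route parameters with the backendRoute values of the converter nodes, the routes can be extracted from the Apache configuration. This is a minimal sketch: balancer_routes is a hypothetical helper name, and it assumes one BalancerMember per line with a route= parameter, as in the sample above.

```shell
# Hypothetical helper: reads an Apache configuration on stdin and prints the
# route= value of every BalancerMember line, sorted. Commented-out lines are
# skipped because the line has to start with BalancerMember.
balancer_routes() {
    sed -n 's/^[[:space:]]*BalancerMember.*[[:space:]]route=\([^[:space:]]*\).*/\1/p' | sort
}

# Usage (the file path is a placeholder for your Apache configuration):
#   balancer_routes < /path/to/your/proxy.conf
```

For the sample configuration this prints DC1 and DC2, which should be exactly the set of backendRoute values configured on the converter nodes.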

With that configuration in place, test the connectivity from the middleware node to that endpoint, e.g. (assuming the frontend node is called frontend1):

# curl http://frontend1/documentconverterws
<html>
<head><meta charset="UTF-8"><title>Open-Xchange DC</title></head>
<body><h1 align="center">OX Software GmbH DC</h1>
<p>WebService is running...</p>
<p>Error Code: 0</p>
<p>API: v5</p></body>
</html>

The response looks exactly like before. The difference is just that we access the service now via Apache.

Make sure you are actually getting responses from the different converter nodes by whatever means suit you (e.g. tcpdump on the converter nodes, looking at Apache's balancer-manager to verify the elected number increases equally for all converter nodes, etc.)


Testing

If you reconfigured something on the middleware node(s), restart the service there.

service open-xchange restart

Wipe out caches on the converter node(s) and restart the service there:

service open-xchange-documentconverter-server stop
rm -rf /var/spool/open-xchange/documentconverter/readerengine.*/*
service open-xchange-documentconverter-server start

(Be careful with your paths, especially when copy-pasting; I don't take responsibility if you remove the wrong thing.)
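
If you prefer some guard rails around the rm -rf, the cache wipe can be wrapped in a function that refuses to run without a valid directory. This is a sketch under assumptions: wipe_dc_cache is a hypothetical helper name, the spool directory has to be passed explicitly, and unlike the glob above it also removes hidden files inside the readerengine.* directories.

```shell
# Hypothetical helper: empty all readerengine.* directories below the given
# spool directory, keeping the directories themselves.
# $1 = spool directory, e.g. /var/spool/open-xchange/documentconverter
wipe_dc_cache() {
    spool_dir="${1:-}"
    # Refuse to do anything without an existing directory as argument
    if [ -z "$spool_dir" ] || [ ! -d "$spool_dir" ]; then
        echo "usage: wipe_dc_cache <existing spool directory>" >&2
        return 1
    fi
    for engine_dir in "$spool_dir"/readerengine.*; do
        # Skip the unexpanded pattern if nothing matches
        [ -d "$engine_dir" ] || continue
        # Delete the contents, but not the directory itself
        find "$engine_dir" -mindepth 1 -delete
    done
}

# Usage, with the service stopped as shown above:
#   wipe_dc_cache /var/spool/open-xchange/documentconverter
```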

If your test user is still logged in from a previous test, log out.

Log in as your test user.

Access Drive. Upload test documents if not done before. Preferably some small trivial docs, also some large multi-page docs with embedded diagrams etc.

Switch to "icons" or "tiles" view. You should get nice preview thumbnails for each document. (This is only a relevant test for the first time since the previews are stored in the database. TODO: find out how to wipe / re-test that.)

Click the "eye" icon to start the Document Viewer. You should be able to view documents with reasonable speed. Scrolling through the pages should be possible and fast.

Click the "pop out" icon at the top right to use the pop-out view. Verify that all pages are rendered correctly in the full-page view and in the thumbnail previews.