Spotfire Information Links, and the entire Information Modelling layer in Spotfire, are a great way to democratize your data.
Information links become a great starting point for your users to build their analytics without knowing SQL, where the server is, which tables to use, how they are joined, and all the other geeky stuff. In a typical instance there is a high likelihood that some of the information links are shared between different reports.
Newer versions of Spotfire offer an option to cache individual information links.
The data behind the information link is cached at the Spotfire Server level. The cache is a Spotfire-specific binary file stored on the hard drive of your Spotfire Server, so if you are caching you need to ensure that there is enough disk space.
The cache is self-cleaning, depending on the timeout and validation query settings. A Spotfire Server restart also clears the cache, so you need to account for that in your planning.
Caching is generally good for non-volatile and non-personalized data, so make sure that you don’t use caching for the wrong use case.
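To make the timeout and validation query idea concrete, here is a minimal conceptual sketch in Python. It is not Spotfire's actual implementation; the names `CacheEntry` and `cache_is_usable` are hypothetical, and it assumes that a validation query invalidates the cache when its result differs from the result recorded when the cache was built.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CacheEntry:
    """Hypothetical record of one cached information link result."""
    created_at: float                 # seconds since epoch when the cache was built
    validation_value: Optional[str]   # validation query result at build time


def cache_is_usable(entry, now, timeout_seconds, current_validation_value=None):
    """Return True if the cached data may be served instead of re-querying."""
    # Timeout rule: the cache expires once it is older than the timeout.
    if now - entry.created_at > timeout_seconds:
        return False
    # Validation rule (assumed): a changed validation result invalidates the cache.
    if (current_validation_value is not None
            and current_validation_value != entry.validation_value):
        return False
    return True
```

For example, an entry built with the validation value "2024-01-01" stays usable within the timeout as long as the validation query still returns "2024-01-01", and is rebuilt as soon as either condition fails.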
Obviously, for the cache to build, the query needs to run at least once. A few options are available here:
1) First user takes the hit
Basically you set up the caching option and the analyses are then used without any additional steps. The first user takes the hit and waits for the query to execute. Subsequent reports that use the same information link can then get the data from the Spotfire Server cache, as long as it meets the timeout or validation query condition.
2) Using Automation Services and fixed schedules
Let’s say you know your data will be ready in the database every day at 4:00 am and you want to cache the data at that point. You could simply create an Automation Services job that uses the Spotfire “Open from Library” task to open any file that uses the information link you want to cache.
Then schedule this job to run at 4:00 am. The job doesn’t really do much more than open a file and hence trigger the information link. After that, depending on the validation query and/or timeout settings, end users of any report that uses the information link get the data served from the Spotfire Server cache.
3) Using Automation Services and triggers
This is similar to option 2, but instead of time-based scheduling you trigger the automation job programmatically, either by running a bat file or by calling the Spotfire Automation Services web services. This way, if you have an ETL process or some other process that needs to be completed first and its completion time is not fixed, you can trigger the job as needed.
4) Using Scheduled Updates
Scheduled updates are a way to cache Spotfire files and data on the Web Player server for quicker access. You can find more details on scheduled updates in the installation manual for the Web Player.
6.0 Installation manual link
Basically, if you schedule at least one file that uses the information link you want to cache, this triggers loading of the information link at specific times, and hence the caching. Scheduled updates can be both time-based and event-based. More details can be found in the installation manual.
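As an illustration of option 3, the sketch below shows how an ETL script might run a bat file once the load has finished. It is only a sketch under assumptions: `trigger_cache_warmup` and the bat file path are hypothetical, the bat file is one you would write yourself to start the Automation Services job, and nothing here is a Spotfire API.

```python
import subprocess
from pathlib import Path


def trigger_cache_warmup(bat_file, etl_succeeded, runner=subprocess.run):
    """Run the batch file that starts the Automation Services job.

    bat_file is a hypothetical path to a .bat file you maintain yourself.
    runner is injectable so the function can be tried without running anything.
    """
    if not etl_succeeded:
        # Do not warm the cache with stale or partial data.
        return False
    # "cmd /c" runs a batch file on Windows; check=True raises if it fails.
    runner(["cmd", "/c", str(Path(bat_file))], check=True)
    return True
```

Calling `trigger_cache_warmup("warm_cache.bat", etl_succeeded=True)` as the last step of the ETL process would then start the job as soon as the data is ready, whatever time that happens to be.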
Considerations
Caching can help overall performance and significantly reduce the
hit on your database if done right. Please consider the following factors while using caching.
1) Make sure enough disk space is available.
2) The benefit of caching only comes after at least one report using the information link has completely loaded. So to ensure that subsequent reports use data from the cache, you should allow for the time needed for the cache to build and the report to load.
E.g. if a particular report needs 15 minutes to load, then the subsequent reports that use the same information link should be triggered after those 15 minutes.
3) Caching does not necessarily mean the ability to handle enormous data; all the good practices like data on demand and in-database connectivity should still be considered. The great part about Spotfire is that you can combine these approaches into a hybrid solution that fits your needs.
4) Also ensure that you have enough connections configured for the data sources you are using, and enough bandwidth to run the queries you plan to cache. E.g. trying to cache 15 queries at the same time may not be as fast as doing them in chunks of 5. This depends on the network, database server load, Spotfire Server specs, web server specs and several other factors, so you should check what works best for your environment.
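The "chunks of 5" idea from point 4 can be sketched as follows. This is a generic Python illustration, not a Spotfire feature: `warm_one` is a hypothetical callable that triggers a single information link (for example by opening a report that uses it), and the chunk size is something to tune for your own environment.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def warm_in_chunks(links, warm_one, chunk_size=5):
    """Warm caches chunk by chunk; the links inside a chunk run concurrently.

    warm_one is a hypothetical callable that triggers one information link.
    """
    results = []
    for chunk in chunked(links, chunk_size):
        # At most chunk_size queries hit the database at the same time,
        # and the next chunk starts only after this one has finished.
        with ThreadPoolExecutor(max_workers=len(chunk)) as pool:
            results.extend(pool.map(warm_one, chunk))
    return results
```

With 15 information links and the default chunk size, this runs three rounds of 5 concurrent queries instead of 15 at once, which keeps the number of simultaneous database connections bounded.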