| Commit message (Collapse) | Author | Age | Files | Lines |
| |
| |
invPMv null; PMVMatrix: Make Mvi, Mvit optional at ctor, add user PMv and PMvi - used at gluUnProject() ..
Matrix4f.mapWin*() variants w/ invPMv don't need temp matrices,
they also shall handle null invPMv -> return false to streamline usage w/ PMVMatrix if inversion failed.
PMVMatrix adds user space common premultiplies Pmv and Pmvi on demand like Frustum.
These are commonly required for e.g. gluUnProject(..)/mapWinToObj(..)
and might benefit from caching if stack is maintained and no modification occurred.
PMVMatrix now has the shader related Mvi and Mvit optional at construction(!), and so are their backing buffers.
This reduces footprint for other use cases.
The 2nd temp matrix is also on-demand, to reduce footprint for certain use cases.
Removed public access to temporary storage.
+++
While these additional matrices are on demand and/or at request @ ctor,
general memory footprint is reduced per default and hence deemed acceptable
while still having PMVMatrix acting as a core flexible matrix provider.
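A minimal sketch of the on-demand caching and the null-invPMv handling described above.
This is not the PMVMatrix or Matrix4f code: the class LazyPMvSketch, its fields and
mapToObj(..) are hypothetical, and 2x2 matrices stand in for the 4x4 P and Mv to keep it short.

```java
// Hypothetical toy sketch: 2x2 row-major matrices {m00, m01, m10, m11} stand in for 4x4 P and Mv.
final class LazyPMvSketch {
    private float[] p  = {1, 0, 0, 1};
    private float[] mv = {1, 0, 0, 1};
    private float[] pmv;   // cached P x Mv, computed on first request
    private float[] pmvi;  // cached inverse of P x Mv, null if inversion failed
    private boolean pmvDirty = true, pmviDirty = true;
    int pmvUpdates = 0;    // counts recomputations, for illustration only

    void setP(final float[] m)  { p = m.clone();  markDirty(); }
    void setMv(final float[] m) { mv = m.clone(); markDirty(); }
    private void markDirty() { pmvDirty = true; pmviDirty = true; }

    /** Returns P x Mv, recomputed only after a modification. */
    float[] getPMv() {
        if (pmvDirty) {
            pmv = mul(p, mv);
            pmvDirty = false;
            pmvUpdates++;
        }
        return pmv;
    }

    /** Returns the inverse of P x Mv, or null if it is not invertible. */
    float[] getPMvi() {
        if (pmviDirty) {
            pmvi = invert(getPMv());
            pmviDirty = false;
        }
        return pmvi;
    }

    /** Maps 'in' through invPMv into 'out'; returns false if invPMv is null. */
    static boolean mapToObj(final float[] invPMv, final float[] in, final float[] out) {
        if (invPMv == null) {
            return false; // inversion failed upstream, nothing to map
        }
        out[0] = invPMv[0] * in[0] + invPMv[1] * in[1];
        out[1] = invPMv[2] * in[0] + invPMv[3] * in[1];
        return true;
    }

    static float[] mul(final float[] a, final float[] b) {
        return new float[] {
            a[0] * b[0] + a[1] * b[2], a[0] * b[1] + a[1] * b[3],
            a[2] * b[0] + a[3] * b[2], a[2] * b[1] + a[3] * b[3] };
    }

    static float[] invert(final float[] m) {
        final float det = m[0] * m[3] - m[1] * m[2];
        if (det == 0f) {
            return null;
        }
        final float s = 1f / det;
        return new float[] { m[3] * s, -m[1] * s, -m[2] * s, m[0] * s };
    }
}
```

A caller can pass getPMvi() straight into mapToObj(..) and only check the boolean result,
which is the streamlined usage the commit aims for.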
|
|
|
|
|
|
|
| |
GraphUI.Shape: Efficiently reuse matPMv and temporary PMVMatrix storage
Reuse PMv in Shape.getSurfaceSize() and Shape.winToShapeCoord(),
for the latter we invert the reused PMv for mapWinToObj (i.e. UnProject).
|
|
|
|
| |
w/ Doxygen. Doxygen uses markdown
|
| |
Utilize Vec3f, Recti, .. throughout API (Matrix4f, AABBox, .. Graph*)
Big Easter Cleanup
- Net -214 lines of code, despite new classes.
- GLUniformData buffer can be synced w/ underlying data via SyncAction/SyncBuffer, e.g. SyncMatrix4f + SyncMatrices4f
- PMVMatrix rewrite using Matrix4f and providing SyncMatrix4f/Matrices4f to sync w/ GLUniformData
- Additional SyncMatrix4f16 + SyncMatrices4f16 covering Matrix4f sync w/ GLUniformData w/o PMVMatrix
- Utilize Vec3f, Recti, .. throughout API (Matrix4f, AABBox, .. Graph*)
- Moved FloatUtil -> Matrix4f, kept a few basic matrix ops for ProjectFloat
- Most, if not all, float[] and int[] should have been moved to proper classes
- int[] -> Recti for viewport rectangle
- Matrix4f and PMVMatrix are covered by math unit tests (as was FloatUtil before) -> safe
Passed all unit tests on AMD64 GNU/Linux
|
| |
for fair and realistic numbers - Both mul() ops faster than FloatUtil
Enhanced invert() of Matrix4f* and FloatUtil: Use 1f/det factor for burst scale.
Enhanced Matrix4f.invert(..): Use factored-out mulScale() to deliver the scale,
giving a good 10% advantage on aarch64 and amd64.
Brings Matrix4f.invert(..) on par w/ FloatUtil, on aarch64 even a 14% advantage.
+++
TestMatrix4f02MulNOUI added an additional Matrix4f.load() to the mul(Matrix4f) loop test,
which surely is an extra burden and not realistic, as the mul(Matrix4f, Matrix4f) and FloatUtil
counterparts also don't count loading a value.
Matrix4f.mul(Matrix4f) shall be used to utilize an already stored value anyways.
Matrix4f.mul(Matrix4f) didn't really exist in FloatUtil.
Same is true for Matrix4f.invert(), re-grouped order, i.e. pushing the non-arg variant last.
+++
Revised performance numbers from commit 15e60161787224e85172685f74dc0ac195969b51
AMD64 + OpenJDK17
- FloatUtil.multMatrix(a, a_off, b, b_off, dest) is considerably slower than all
- Matrix4f.mul(a, b) roughly ~10% faster than FloatUtil.multMatrix(a, b, dest)
- Matrix4f.mul(b) roughly ~18% faster than FloatUtil.multMatrix(a, b, dest) (*)
- Matrix4f.invert(a) roughly ~ 2% faster than FloatUtil.invertMatrix(..)
- Matrix4f.invert() roughly ~ 4% slower than FloatUtil.invertMatrix(..) (*)
- Launched: nice -19 scripts/tests-x64.sh
RaspberryPi 4b aarch64 + OpenJDK17
- FloatUtil.multMatrix(a, a_off, b, b_off, dest) is considerably slower than all
- Matrix4f.mul(a, b) roughly ~ 9% faster than FloatUtil.multMatrix(a, b, dest)
- Matrix4f.mul(b) roughly ~14% faster than FloatUtil.multMatrix(a, b, dest) (*)
- Matrix4f.invert(a) roughly ~14% faster than FloatUtil.invertMatrix(..)
- Matrix4f.invert() roughly ~12% faster than FloatUtil.invertMatrix(..) (*)
- Launched: nice -19 scripts/tests-linux-aarch64.sh
(*) not a true comparison in feature, as operating on 'this' matrix values
for one argument, unavailable to FloatUtil.
Conclusion
- Matrix4f.mul(..) is considerably faster!
- Matrix4f.invert(..) faster, esp on aarch64
The additional Matrix4fb tests using float[16], similar to FloatUtil,
also demonstrate lower performance compared to Matrix4f using
dedicated float fields.
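A minimal sketch of the two ideas above: dedicated float fields instead of an array,
and computing the 1f/det factor once to scale all elements in one burst.
This is not the Matrix4f code: Mat3Sketch is a hypothetical 3x3 stand-in chosen to keep the cofactors short.

```java
// Hypothetical 3x3 stand-in for the field-based matrix layout, row-major naming mRC.
final class Mat3Sketch {
    float m00, m01, m02;
    float m10, m11, m12;
    float m20, m21, m22;

    /** Takes 9 values in row-major order. */
    Mat3Sketch(final float... v) {
        m00 = v[0]; m01 = v[1]; m02 = v[2];
        m10 = v[3]; m11 = v[4]; m12 = v[5];
        m20 = v[6]; m21 = v[7]; m22 = v[8];
    }

    /** Inverts in place; returns false and leaves the matrix untouched if singular. */
    boolean invert() {
        // Cofactors of the first row, also used for the determinant.
        final float c00 = m11 * m22 - m12 * m21;
        final float c01 = m12 * m20 - m10 * m22;
        final float c02 = m10 * m21 - m11 * m20;
        final float det = m00 * c00 + m01 * c01 + m02 * c02;
        if (det == 0f) {
            return false;
        }
        final float c10 = m02 * m21 - m01 * m22;
        final float c11 = m00 * m22 - m02 * m20;
        final float c12 = m01 * m20 - m00 * m21;
        final float c20 = m01 * m12 - m02 * m11;
        final float c21 = m02 * m10 - m00 * m12;
        final float c22 = m00 * m11 - m01 * m10;

        // One division, then nine multiplications: inverse = adjugate * (1/det),
        // where the adjugate is the transposed cofactor matrix.
        final float s = 1f / det;
        m00 = c00 * s; m01 = c10 * s; m02 = c20 * s;
        m10 = c01 * s; m11 = c11 * s; m12 = c21 * s;
        m20 = c02 * s; m21 = c12 * s; m22 = c22 * s;
        return true;
    }
}
```

Whether a single reciprocal plus multiplications beats per-element division depends on the JVM and CPU,
so the numbers listed above should be treated as the reference, not this sketch.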
|
|
|
|
| |
15e60161787224e85172685f74dc0ac195969b51
|
| |
Ray, AABBox, Frustum, Stereo*, ... adding hook to PMVMatrix
Motivation was to simplify matrix + vector math usage, ease review and avoid usage bugs.
Matrix4f implementation uses dedicated float fields instead of an array.
Performance didn't increase much,
as JVM >= 11(?) has some optimizations to drop the array bounds check.
AMD64 + OpenJDK17
- Matrix4f.mul(a, b) got a roughly ~10% enhancement over FloatUtil.multMatrix(a, b, dest)
- Matrix4f.mul(b) roughly ~3% slower than FloatUtil.multMatrix(a, b, dest)
- FloatUtil.multMatrix(a, a_off, b, b_off, dest) is considerably slower than all
- Matrix4f.invert(..) roughly ~3% slower than FloatUtil.invertMatrix(..)
RaspberryPi 4b aarch64 + OpenJDK17
- Matrix4f.mul(a, b) got a roughly ~10% enhancement over FloatUtil.multMatrix(a, b, dest)
- Matrix4f.mul(b) roughly ~20% slower than FloatUtil.multMatrix(a, b)
- FloatUtil.multMatrix(a, a_off, b, b_off, dest) is considerably slower than all
- Matrix4f.invert(..) roughly ~4% slower than FloatUtil.invertMatrix(..)
Conclusion
- Matrix4f.mul(b) needs to be revised (esp for aarch64)
- Matrix4f.invert(..) should also not be slower ..
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
animation etc
Implementation borrowed from my 'gfxbox2' C++ project
<https://jausoft.com/cgit/cs_class/gfxbox2.git/tree/include/pixel/pixel3f.hpp#n29>
and its layout from OpenAL's Vec3f.
|
|
|
|
| |
OutlineShape.Visitor, allowing to use the Glyph (information).
|
|
|
|
| |
GLAutoDrawable.invoke(..) API doc: Add semantics about GLRunnable return value.
|
|
|
|
|
|
| |
90a95e6f689b479f3c3ae3caf4e30447030c7682
A null buffer is possible in case initialElementCount at ctor is <= 0
|
| |
|
| |
|
|
|
|
| |
API doc
|
|
|
|
| |
useProgram() only throws an exception if 'on==true' is requested (disabling after deletion is OK)
|
|
|
|
| |
dump{Shader->}Source(), refine string output.
|
|
|
|
| |
infinite dimension
|
|
|
|
| |
explicitly to set the name upfront, clarifying workflow. Impl: ImageSequence + GLMediaPlayerImpl
|
| |
showing test-texture. Adding stop(); (API Change)
- allow multiple initGL(..) @ uninitialized and initialized
- allows usage before stream is ready
- using a test-texture @ uninitialized
- adding stop()
API change
- initStream() -> playStream()
- play() -> resume()
FFMPEG: Added 'ready' check for robustness
|
|\
| |
| | |
Fix for AWT GLCanvas DPI scaling. Forum thread https://forum.jogamp…
|
| |
| |
| |
| | |
https://forum.jogamp.org/DPI-scaling-not-working-td4042206.html
|
| | |
|
| |
| |
| |
| | |
Consider applying it in default chooser?
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
undesired (Graph VBAA + MSAA); Add NonFSAAGLCapabilitiesChooser
Notable: On Raspberry Pi 4b w/ Mesa3D's Broadcom/VC driver,
the chosen capabilities are multisample ones even though not requested.
This causes
- extra performance overhead
- doubled AA: 1st our VBAA, then the FSAA (multisample) -> loss of sharpness
Simply dropping the undesired FSAA helps and ups performance
on the Raspi board (22 -> 35 fps).
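A minimal sketch of a chooser that drops undesired FSAA, as motivated above.
This is not the NonFSAAGLCapabilitiesChooser code: CapsSketch and NonFsaaChooserSketch are
hypothetical stand-ins, and the selection policy (first non-multisample candidate) is an assumption.

```java
import java.util.List;

// Hypothetical stand-in for a capabilities entry offered by the window system.
final class CapsSketch {
    final boolean sampleBuffers; // true if multisampling (FSAA) is enabled
    final int numSamples;

    CapsSketch(final boolean sampleBuffers, final int numSamples) {
        this.sampleBuffers = sampleBuffers;
        this.numSamples = numSamples;
    }
}

final class NonFsaaChooserSketch {
    /**
     * Returns the index of a candidate without multisampling.
     * Keeps the recommended choice if it is already non-FSAA,
     * otherwise picks the first non-FSAA candidate,
     * and falls back to the recommended choice if none exists.
     */
    static int choose(final List<CapsSketch> available, final int recommended) {
        final boolean recommendedValid = 0 <= recommended && recommended < available.size()
                                         && available.get(recommended) != null;
        if (recommendedValid && !available.get(recommended).sampleBuffers) {
            return recommended;
        }
        for (int i = 0; i < available.size(); i++) {
            final CapsSketch c = available.get(i);
            if (c != null && !c.sampleBuffers) {
                return i;
            }
        }
        return recommended; // nothing better available
    }
}
```

A real chooser would likely also weigh color, depth and stencil bits against the requested
capabilities; this sketch only isolates the FSAA filter.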
|
| | |
|
|/
|
| |
used for getElemCount() instead of 0==position, ... (API change)
API Change
- sealed() moved up from GLArrayDataEditable -> GLArrayData
- GLArrayDataWrapper is sealed by default
- getSizeInBytes() -> getByteCount()
- Semantics of getElemCount() and getByteCount()
- Correctly use sealed() to switch from position to limit - instead of 0==position
Aligned method names:
- getElemCount()
- elemPosition()
- remainingElems()
- getElemCapacity()
to corresponding byte counts:
- getByteCount()
- bytePosition()
- remainingBytes()
- getByteCapacity()
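A minimal sketch of the sealed() semantics listed above, where the element count is taken from
the buffer limit once sealed and from the position while still being written.
This is not the GLArrayData code: SealedArraySketch is hypothetical and only borrows the method names.

```java
import java.nio.FloatBuffer;

// Hypothetical sketch of a float array wrapper with seal semantics.
final class SealedArraySketch {
    private final FloatBuffer buffer;
    private final int compsPerElem;
    private boolean sealed = false;

    SealedArraySketch(final int compsPerElem, final int elemCapacity) {
        this.compsPerElem = compsPerElem;
        this.buffer = FloatBuffer.allocate(compsPerElem * elemCapacity);
    }

    void put(final float v) {
        if (sealed) {
            throw new IllegalStateException("sealed");
        }
        buffer.put(v);
    }

    /** Sealing flips the buffer for reading, unsealing restores the write position. */
    void seal(final boolean seal) {
        if (sealed == seal) {
            return;
        }
        sealed = seal;
        if (seal) {
            buffer.flip(); // limit = written components, position = 0
        } else {
            buffer.position(buffer.limit());
            buffer.limit(buffer.capacity());
        }
    }

    boolean sealed() { return sealed; }

    /** Uses the sealed state, not a '0 == position' guess, to pick limit or position. */
    int getElemCount() {
        return (sealed ? buffer.limit() : buffer.position()) / compsPerElem;
    }

    int getByteCount() { return getElemCount() * compsPerElem * Float.BYTES; }

    int getElemCapacity() { return buffer.capacity() / compsPerElem; }

    int getByteCapacity() { return buffer.capacity() * Float.BYTES; }
}
```

With the '0 == position' heuristic, a fresh unsealed buffer at position 0 could be mistaken for
a sealed one; keying on the explicit flag avoids that ambiguity.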
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
variants for flexibility/performance
Notable: The array-put is slower than small range single-puts, e.g. put3i(..).
Uses GlueGen's Buffers change commit 69b748925038b7d44fa6318536642b426e3d3e38
|
|
|
|
| |
createDummyDrawable(..)
|
|
|
|
| |
is not orig-owner
|
|
|
|
|
|
| |
dropping its usage (GLArrayDataWrapper validation)
Skip GLProfile based index, comps, type validation, might not be future proof.
|
|
|
|
| |
additional components ...
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
finalize immutables, add growthFactor (default golden ratio 1.618), add getCapacity*() and printStats(..)
The growthFactor becomes essential for better growth behavior and can be set via setGrowthFactor().
The other changes were merely to clean up the GLArrayData interface and its 4 implementations.
Not great to change its API, but one name was misleading ['getComponentCount' -> 'getCompsPerElem'],
so overall .. readability is enhanced.
Motivation for this change was the performance analysis and improvement of our Graph Curve Renderer.
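A minimal sketch of growth-factor based resizing as described above.
This is not the GLArrayData implementation: GrowableFloatsSketch and its rounding policy are
hypothetical, only the default factor of 1.618 and the setGrowthFactor() name come from the text.

```java
import java.util.Arrays;

// Hypothetical growable float storage using a multiplicative growth factor.
final class GrowableFloatsSketch {
    static final float DEFAULT_GROWTH_FACTOR = 1.618f; // golden ratio

    private float[] data;
    private int size = 0;
    private float growthFactor = DEFAULT_GROWTH_FACTOR;

    GrowableFloatsSketch(final int initialCapacity) {
        data = new float[Math.max(0, initialCapacity)];
    }

    /** Assumption: factors below 1 make no sense for growth and are clamped to 1. */
    void setGrowthFactor(final float f) { growthFactor = Math.max(1f, f); }

    int getCapacity() { return data.length; }

    int size() { return size; }

    void put(final float v) {
        growIfNeeded(1);
        data[size++] = v;
    }

    /** Grows to capacity * growthFactor, or to the required size if that is larger. */
    private void growIfNeeded(final int spaceNeeded) {
        final int required = size + spaceNeeded;
        if (required > data.length) {
            final int grown = (int) Math.ceil(data.length * (double) growthFactor);
            data = Arrays.copyOf(data, Math.max(required, grown));
        }
    }
}
```

A multiplicative factor keeps the number of reallocations logarithmic in the final size,
which is why it matters when many small puts hit the same buffer.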
|
| |
|
|\
| |
| | |
Add missing case in getDbgSeverityString()
|
| | |
|
|/
|
| |
rules for OutlineShape and add get/setWinding in Outline
Loop.initFromPolyline()'s Winding determination used a 3-point triangle-area method,
which is insufficient for complex shapes like serif 'g' or 'æ'.
Solved by using the whole area over the Outline shape.
Note: Loop.initFromPolyline()'s Winding determination is used to convert
the inner shape or holes to CW only.
Therefore the outer boundary shapes must be CCW.
This detail has been documented within OutlineShape, anchor 'windingrules'.
Since the conversion of 'CCW -> CW' for inner shapes or holes is covered,
a safe user path would be to completely create CCW shapes.
However, this has not been hardcoded and is left to the user.
Impact: Fixes rendering serif 'g' or 'æ'.
The enhanced unit test TestTextRendererNEWT01 produces snapshots for all fonts within FontSet01.
While it shows proper rendering of the single Glyphs it exposes another Region/Curve Renderer bug,
i.e. sort-of a Region overflow crossing over from the box-end to the start.
|
|
|
|
|
|
|
|
|
| |
system.
- All pixelSize metrics methods are dropped in Font*
- TypecastGlyph.Advance dropped, i.e. dropping the prescaled glyph advance based on pixelSize
- TextRegionUtil produces OutlineShape in font em-size [0..1] added to GLRegion
- Adjusted demos
|
|
|
|
|
|
| |
Add Path2F addPath(..), emphasize required Winding.CW
GPURegionGLListener01 used by TestRegionRendererNEWT01 covers Path2F CCW and CW (reverse add) methods.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GLContext.isGL3bcAvailable() check
Julien Gouesse resolved this odd issue, where a requested GL2 profile was mapped to GL3bc but is not implemented,
see <https://forum.jogamp.org/InternalError-XXX0-profile-2-GL2-gt-profileImpl-GL3bc-not-mapped-td4041754i20.html#a4042018>.
I expanded his patch a little to reuse the GLContext.getAvailableGLProfileName() result
and simplify the conditional statement.
This might need more testing, plus an analysis of why GLContext.getAvailableGLProfileName()
offers GL3bc although it is not available via the GLContext.isGL3bcAvailable() check.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
native DPI toolkit aware platforms (Linux, Windows)
NEWT + NewtCanvasAWT:
Maybe create "interface ScalableSurface.Upstream {
void pixelScaleChangeNotify(final float[] curPixelScale, final float[] minPixelScale, final float[] maxPixelScale); }"
to allow downstream to notify upstream ScalableSurface implementations like NEWT's Window to act accordingly.
+++
AWT GLCanvas: Add remark where to add the potential pixel scale.
|
|
|
|
| |
setBounds(..)
|